Doppel Studio
The notebook for non-engineers.
- Point-and-click data connections
- One-screen quality preview
- Compliance officer mode (signed reports)
- Org-wide governance and lineage
Tabular, time-series, multi-table relational, free-text, geospatial — Doppelset learns it all and synthesizes a faithful clone.
The shape of a doppel
Buy the SDK and ship in an afternoon. Bring Studio for the rest of the org. Run Vault when regulators are watching.
Studio: the notebook for non-engineers.
Engine: the Python SDK and REST API.
Vault: self-hosted, air-gapped, regulated.
From a 200-row CSV to a 4-billion-row Parquet warehouse. From IoT to ICD-10. Doppelset learns it all.
Flat tables, wide or long. Doppelset learns mixed continuous, categorical, ordinal and high-cardinality columns simultaneously.
Star schemas, snowflakes, parent/child joins. Foreign keys, cardinalities and check constraints are preserved end-to-end.
Hourly, daily, irregular. Seasonality, drift, regime shifts and rare anomalies are modelled together — not as add-ons.
Latitudes, postal codes, polygons, trajectories. Synthesised so density and movement patterns survive, but no person does.
Short fields (addresses, ticket summaries) and longer documents (support transcripts, medical notes) — language-aware, PII-stripped.
Clickstreams, telemetry, transaction logs. Doppelset preserves session structure, dwell time, and conversion funnels.
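One way to spot-check the claim that foreign keys survive synthesis is to look for child rows whose key has no matching parent. The sketch below is a minimal, library-free illustration; the table and column names (`customers`, `transactions`, `customer_id`) are hypothetical, and it is not part of the Doppelset SDK.

```python
def orphaned_keys(child_rows, parent_rows, fk, pk):
    """Return the sorted foreign-key values in child_rows with no matching parent."""
    parent_ids = {row[pk] for row in parent_rows}
    return sorted({row[fk] for row in child_rows if row[fk] not in parent_ids})


# Hypothetical synthetic output: two parent rows, three child rows.
customers = [{"customer_id": 1}, {"customer_id": 2}]
transactions = [
    {"txn_id": 10, "customer_id": 1},
    {"txn_id": 11, "customer_id": 2},
    {"txn_id": 12, "customer_id": 3},  # no such customer: integrity is broken
]

print(orphaned_keys(transactions, customers, "customer_id", "customer_id"))  # [3]
```

An empty result means every child row joins to a parent, which is what "preserved end-to-end" should imply for a synthetic relational dataset.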
Five layers, one job: produce statistically faithful data without ever serving a real record back to the model.
01. Doppelset inspects the schema, infers types, detects PII, and maps relationships.
02. A hybrid model (tabular transformer + diffusion) learns the joint distribution of columns and rows.
03. Differential-privacy noise is added to gradients. Memorisation tests run automatically.
04. New rows are sampled, reshaped to your schema, and streamed to your sink of choice.
05. A signed quality + privacy report is attached to every output so your auditor can verify the run.
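The gradient-noise step in the third layer resembles the widely used DP-SGD recipe: clip each example's gradient to a fixed norm, sum, add Gaussian noise scaled to that norm, then average. The page does not say how Doppelset implements it, so the following is only a sketch of the general technique; `max_norm`, `noise_multiplier`, and the function names are assumptions made for illustration.

```python
import math
import random


def clip(grad, max_norm):
    """Scale a gradient vector down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip per-example gradients, add Gaussian noise to their sum, then average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm  # noise scale is tied to the clipping bound
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # the first has norm 5.0 and will be clipped to 1.0
print(private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single record can move the model, and the noise hides what remains. The resulting privacy budget (epsilon) depends on the noise multiplier, batch sampling rate, and number of training steps, which a privacy accountant would track and which this sketch omits.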
```python
from doppelset import Doppelset
client = Doppelset()

# Train on a 3-table relational schema
twin = client.train(
    source={
        "customers": "postgres://prod/customers",
        "transactions": "postgres://prod/transactions",
        "tickets": "postgres://prod/tickets",
    },
    relations="auto",          # learn the foreign keys
    privacy={"epsilon": 1.0},  # mathematical privacy budget
)

# Sample a 10x scaled-up version
out = twin.sample(scale=10, balance={"is_fraud": 0.5})

# Inspect the run
print(out.report.fidelity, out.report.epsilon)
```
The Engine learns the joint distribution of three tables (with foreign keys) and lets you scale, balance, and re-sample in one call.
Doppelset is the only synthetic-data platform with signed quality and privacy reports, deterministic re-runs, and a tamper-evident run log. We help you say yes to data sharing — with a receipt.
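A common way to make a run log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash, so editing any past entry invalidates everything after it. Whether Doppelset's run log works this way is not stated here; the sketch below only illustrates the general idea, and the entry fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(log, payload):
    """Append a payload to the log, chaining it to the last entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(log):
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"step": "train", "epsilon": 1.0})
append(log, {"step": "sample", "scale": 10})
print(verify(log))  # True

log[0]["payload"]["epsilon"] = 8.0  # simulate tampering with a past entry
print(verify(log))  # False
```

A chain like this only detects edits if the latest hash is also anchored somewhere the editor cannot rewrite, for example inside a signed report.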
Generate your first 100,000 synthetic rows in the next ten minutes. No credit card.