The product

One platform. Every kind of data.

Tabular, time-series, multi-table relational, free-text, geospatial — Doppelset learns it all and synthesizes a faithful clone.

The shape of a doppel

Three editions

One platform. Three ways to land it in your team.

Buy the SDK and ship in an afternoon. Bring Studio for the rest of the org. Run Vault when regulators are watching.

Doppel Studio

The notebook for non-engineers.

Point-and-click data connections
One-screen quality preview
Compliance officer mode (signed reports)
Org-wide governance and lineage

Doppel Engine

The Python SDK and REST API.

doppelset 0.42 — pip install doppelset
Streaming generation for billions of rows
Composable hooks for evals & guardrails
Plug into Airflow / dbt / Databricks

Doppel Vault

Self-hosted, air-gapped, regulated.

Runs on your VPC or on-prem cluster
Hardware-backed key management
BYOK / HSM / KMIP integrations
Tenants, RBAC, audit logs

Coverage

Every shape of data your team works with.

From a 200-row CSV to a 4-billion-row Parquet warehouse. From IoT to ICD-10. Doppelset learns it all.

Tabular

Flat tables, wide or long. Doppelset learns mixed continuous, categorical, ordinal and high-cardinality columns simultaneously.

CSVParquetPostgresSnowflake

Relational

Star schemas, snowflakes, parent/child joins. Foreign keys, cardinalities and check constraints are preserved end-to-end.

DDL importForeign keysCycles

Time-series

Hourly, daily, irregular. Seasonality, drift, regime shifts and rare anomalies are modelled together — not as add-ons.

ParquetKafkaInflux

Geospatial

Latitudes, postal codes, polygons, trajectories. Synthesised so density and movement patterns survive, but no person does.

GeoJSONWKTH3

Free-text

Short fields (addresses, ticket summaries) and longer documents (support transcripts, medical notes) — language-aware, PII-stripped.

32 languages

Event streams

Clickstreams, telemetry, transaction logs. Doppelset preserves session structure, dwell time, and conversion funnels.

JSON LinesAvro

Under the hood

The architecture of a Doppelset run

Five layers, one job: produce statistically faithful data without ever serving a real record back to the model.

01
Profile
Doppelset inspects the schema, infers types, detects PII, and maps relationships.
02
Train
A hybrid model (tabular transformer + diffusion) learns the joint distribution of columns and rows.
03
Govern
Differential-privacy noise is added to gradients. Memorisation tests run automatically.
04
Sample
New rows are sampled, reshaped to your schema, and streamed to your sink of choice.
05
Report
A signed quality + privacy report is attached to every output so your auditor can verify the run.

1from doppelset import Doppelset
2client = Doppelset()
3
4# Train on a 3-table relational schema
5twin = client.train(
6    source={
7        "customers":    "postgres://prod/customers",
8        "transactions": "postgres://prod/transactions",
9        "tickets":      "postgres://prod/tickets",
10    },
11    relations="auto",         # learn the foreign keys
12    privacy={"epsilon": 1.0}, # mathematical privacy budget
13)
14
15# Sample a 10x scaled-up version
16out = twin.sample(scale=10, balance={"is_fraud": 0.5})
17
18# Inspect the run
19print(out.report.fidelity, out.report.epsilon)

Real example

A three-table doppel in one notebook cell.

The Engine learns the joint distribution of three tables (with foreign keys) and lets you scale, balance, and re-sample in one call.

3 tables4 FK relationsε = 1.010× scale-upfraud-rebalanced

Try a doppel now →

regulated industries

Built for the teams that get audited.

Doppelset is the only synthetic-data platform with signed quality and privacy reports, deterministic re-runs, and a tamper-evident run log. We help you say yes to data sharing — with a receipt.

SOC 2 Type II

ISO 27001

GDPR Art. 25

HIPAA Safe Harbor

DORA-aligned

EU AI Act ready

Try it now

Ship faster. Stop arguing with legal.

Generate your first 100,000 synthetic rows in the next ten minutes. No credit card.

Open the playground →Talk to a synthesist ↗