Differential privacy
Add calibrated noise to gradients during training. Pick your ε per dataset; we'll report the budget consumed and warn before you exceed it.
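For readers who want to see the idea in code, here is a minimal sketch of one noisy gradient step in the style of DP-SGD: clip each example's gradient, sum, add Gaussian noise scaled to the clipping norm, then average. This is not Doppelset's implementation. The function names, `max_norm`, and `noise_multiplier` are illustrative assumptions, and the accounting that turns the noise level into a consumed ε is not shown.

```python
import math
import random

def clip(grad, max_norm):
    """Scale a gradient vector down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip per-example gradients, sum them, add Gaussian noise, and average.

    Hypothetical helper for illustration only. The noise standard deviation
    is noise_multiplier * max_norm, applied to the summed gradient.
    """
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]

# Example: two per-example gradients, clipped to norm 1.0 before noising.
rng = random.Random(0)
step = private_mean_gradient([[3.0, 4.0], [0.1, -0.2]], 1.0, 1.1, rng)
```

A larger `noise_multiplier` spends less privacy budget per step at the cost of a noisier update.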
Pick a row. Open a feature. Or skim the lot — we hide nothing, we oversell nothing. If a checkmark isn't here, the feature isn't built yet.
60+ entity types across 32 languages: names, emails, IBANs, ICD-10 codes, GPS coordinates, NIE numbers… all stripped before the model sees them.
k-anonymity, l-diversity, and t-closeness checks run on every output. Records that look unique against background knowledge get re-sampled.
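As a rough illustration of the k-anonymity part of that check, the sketch below groups rows by their quasi-identifier values and flags rows whose group is smaller than k. It is an assumption-laden example, not the product's checker: the column names, the sample rows, and both helper functions are hypothetical.

```python
from collections import Counter

def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)

def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group, i.e. the k the table achieves."""
    sizes = group_sizes(rows, quasi_identifiers)
    return min(sizes.values()) if sizes else 0

def rows_below_k(rows, quasi_identifiers, k):
    """Return indices of rows whose group has fewer than k members."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        i for i, row in enumerate(rows)
        if sizes[tuple(row[q] for q in quasi_identifiers)] < k
    ]

# Hypothetical sample: the third row is unique on (age_band, zip3).
rows = [
    {"age_band": "30-39", "zip3": "280", "diagnosis": "J45"},
    {"age_band": "30-39", "zip3": "280", "diagnosis": "E11"},
    {"age_band": "70-79", "zip3": "081", "diagnosis": "I10"},
]
flagged = rows_below_k(rows, ["age_band", "zip3"], k=2)
```

Rows returned by `rows_below_k` are the candidates a generator would re-sample.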
Every generation runs a battery of attribute-disclosure and membership-inference attacks. The signed report includes pass/fail per attack.
A masked-attention model that learns mixed categorical, ordinal and continuous columns jointly — no one-hot blow-up, no embedding hand-tuning.
Iterative denoising at sample time reproduces sharper tails and rarer regimes than GAN-only approaches. Selected automatically for tables dominated by numerical columns.
A second-pass model enforces foreign-key cardinality, parent/child structures, and join distributions across multi-table generations.
Captures seasonality, drift, change-points, and rare anomalies in a single state-space model. Generates irregular series too.
A single number, 0 → 100, that summarises 80+ tests: marginals, correlations, conditional distributions, downstream ML utility.
Per-column overlap plots with KL/Wasserstein/JS distances. Inline annotations explain why a column drifted.
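To make one of those distances concrete, here is a small sketch of the Jensen-Shannon distance between two histograms of the same column (real versus synthetic). It assumes both histograms use the same bins and uses base-2 logarithms so the result falls between 0 and 1. The function names are illustrative, not part of any Doppelset API.

```python
import math

def normalise(counts):
    """Turn raw bin counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """KL(p || q) in bits; bins where p is zero contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_distance(real_counts, synth_counts):
    """Jensen-Shannon distance: square root of the JS divergence (base 2)."""
    p = normalise(real_counts)
    q = normalise(synth_counts)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negative rounding

# Example: compare binned real and synthetic values of one column.
distance = js_distance([40, 35, 25], [38, 36, 26])
```

Identical histograms score 0, and histograms with no overlapping bins score 1.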
Train-on-Synthetic, Test-on-Real benchmark across 14 model families. Pick a task — classification, regression, churn — get a number.
Every output ships with a JSON receipt cryptographically signed by Doppelset. Hand it to your auditor, paste it in your DPIA.
pip install doppelset · type-checked · async-first · streams sample() output so 4 billion rows never have to sit in RAM.
@doppelset/sdk on npm. Works in Node, Bun, Deno, and Cloudflare Workers. ESM first.
OpenAPI 3.1, OAuth 2.1, idempotency keys, retries, server-sent events for long-running jobs.
First-class connectors for Snowflake, Databricks, BigQuery, Redshift, Postgres, MySQL, S3, GCS, Airflow, dbt, Kafka, MongoDB.
Okta, Entra, Google, JumpCloud, OneLogin. Automatic provisioning, role propagation, just-in-time access.
Granular RBAC. Synth-only seats for analysts, viewer seats for compliance officers, admin for platform owners.
Every generation, every config change, every download — append-only, hash-chained, exportable to your SIEM.
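A hash-chained log can be sketched in a few lines: each entry stores the hash of the previous entry, so editing or deleting any earlier record breaks verification of everything after it. The example below is a generic illustration using SHA-256 from the Python standard library. The entry layout, event fields, and function names are assumptions, not Doppelset's log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(prev_hash, event):
    """Hash the previous hash together with a canonical JSON form of the event."""
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append(log, event):
    """Append an event, linking it to the hash of the last entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash,
                "hash": entry_hash(prev_hash, event)})

def verify(log):
    """Recompute every link; return False if any entry was altered or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True

# Hypothetical events for illustration.
log = []
append(log, {"action": "generation", "rows": 100000})
append(log, {"action": "download", "user": "analyst@example.com"})
```

Anyone holding the exported log can run `verify` without trusting the system that produced it, provided the most recent hash is anchored somewhere they do trust.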
EU (Madrid, Frankfurt), US (Virginia, Oregon), or your own VPC. Data never leaves the region you pick.
Generate your first 100,000 synthetic rows in the next ten minutes. No credit card.