Synthetic Data Vault

Companies are generating ever larger amounts of valuable customer data, but privacy considerations often prevent them from using it to its full potential. The Synthetic Data Vault (SDV) lets data scientists sidestep data-sharing concerns and widen the pool of people who can work on a problem by generating synthetic data. By learning a generative model that accounts for dependence and relationships in the data, the SDV creates new data that resembles the original set statistically, formally, and structurally, and can therefore easily be used in its place. In our tests, data scientists using SDV data performed as well as or better than data scientists using the original data in more than 70% of cases.