18 - 09 - 2024

In the earlier articles of this series on Unified Marketing Measurement (UMM), we explored how UMM expands upon the traditional Marketing Mix Modeling (MMM) approach to address key challenges facing marketers today. These include the declining role of cookies, the growing demand for performance evaluations, and the need for marketing budget optimization (Part 1). We also discussed how the integration of Artificial Intelligence, including Generative AI, is fueling this transformation (Part 2).

Another promising way AI can elevate UMM and extend its reach is through synthetic data.

What are Synthetic Data and Why Are They Useful for UMM?

Synthetic data are artificially created datasets. While they are not identical to real-world data, they closely resemble actual datasets and preserve their key statistical properties. Although synthetic data have long been used, particularly by researchers, advances in Generative AI have significantly improved their sophistication.

When constructing a UMM model, data serve as the foundation. High-quality, comprehensive datasets spanning multiple years and containing detailed information are essential for creating a reliable and effective measurement model. However, there are times when the available data may be insufficient.

Older data may have become irrelevant due to significant changes in the channel mix or measurement systems used. The granularity may be inadequate or only sufficient for certain aspects of the data — often those related to digital channels. Some data might simply not exist if the market or line of business being measured is new.

In these cases, synthetic data might offer an advanced solution.

Opportunities and Challenges of Integrating Synthetic Data into a UMM Model

A GenAI-based synthetic data approach for UMM can:

  • Augment Data: Generate new data while maintaining the original statistical features and incorporating domain expertise. This can be particularly useful when addressing rare events.
  • Balance Data: If the original dataset contains known biases that should not influence the analysis, it can be rebalanced using synthetic data. For example, an attribution error may cause too many conversions to be credited to the same channel, and synthetic data can help correct that skew.
  • Anonymize Data: Privacy, retention, and access restrictions on customer data can significantly limit analytical capabilities and revenue generation. Synthetic data address this challenge by offering secure datasets that are far more nuanced than those generated using traditional anonymization methods.
  • Share Data: Properly anonymized data can be shared with other businesses without compromising the privacy of the business or its customers. This form of Data Monetization enables all parties involved to leverage large, reliable datasets to enhance their models and analyses.
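
To make the "Augment Data" idea concrete, here is a minimal sketch of generating synthetic rows that keep the means, spreads, and correlation of an original dataset. It deliberately uses a simple Gaussian model as a stand-in for a GenAI generator, which is an assumption made for brevity and not the method described in this series. The "real" dataset, the column meanings (weekly spend and conversions), and the function names `fit_gaussian` and `sample_synthetic` are all hypothetical.

```python
import math
import random


def fit_gaussian(rows):
    """Estimate means, standard deviations, and correlation of (spend, conversions) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in rows) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for _, y in rows) / (n - 1)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (n - 1)
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "std_x": math.sqrt(var_x),
        "std_y": math.sqrt(var_y),
        "corr": cov / math.sqrt(var_x * var_y),
    }


def sample_synthetic(params, n, rng):
    """Draw n synthetic pairs from a bivariate Gaussian with the fitted statistics."""
    rho = params["corr"]
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = params["mean_x"] + params["std_x"] * z1
        # Mixing z1 and z2 reproduces the fitted correlation between x and y.
        y = params["mean_y"] + params["std_y"] * (rho * z1 + math.sqrt(1 - rho**2) * z2)
        rows.append((x, y))
    return rows


rng = random.Random(42)

# Hypothetical "real" data: 104 weeks of spend and conversions (illustrative only).
real_rows = []
for _ in range(104):
    spend = rng.uniform(10, 50)
    conversions = 3 * spend + rng.gauss(0, 10)
    real_rows.append((spend, conversions))

real_params = fit_gaussian(real_rows)
synthetic_rows = sample_synthetic(real_params, 5000, rng)
synthetic_params = fit_gaussian(synthetic_rows)

for key in real_params:
    print(f"{key}: real={real_params[key]:.2f} synthetic={synthetic_params[key]:.2f}")
```

A Gaussian model preserves only first- and second-order statistics and can produce implausible values such as negative spend. Capturing seasonality, saturation effects, or rare events is where more sophisticated generators and domain expertise come in.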

However, to use synthetic data effectively for a successful UMM project, certain conditions must be met, and key factors need to be taken into account:

  • Quality synthetic data require careful crafting and human intervention to effectively incorporate domain knowledge and the specificities of the business.
  • There will still be cases where there are simply too few real data. Adding synthetic data without a sufficient foundation of real data can reduce the accuracy of the results and might have other consequences that should be examined in advance and accepted as trade-offs.
  • A synthetic dataset is always created with a specific purpose in mind. It balances fidelity, utility, and privacy, and it incorporates the known data and domain knowledge relevant to its intended use. While it can be reused and even sold, its characteristics must be well understood to maintain its value.
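
One way to make the privacy side of that balance tangible is to check how close each synthetic record sits to its nearest real record. The sketch below is a simple heuristic under stated assumptions, not a complete privacy assessment: the function name `privacy_report`, the sample rows, and the distance threshold are all hypothetical, and in practice the columns would need to be scaled before distances are compared.

```python
import math


def privacy_report(synthetic_rows, real_rows, threshold):
    """Summarize how close synthetic rows are to their nearest real row."""
    # Euclidean distance from each synthetic row to the closest real row.
    distances = [
        min(math.dist(syn, real) for real in real_rows) for syn in synthetic_rows
    ]
    too_close = sum(1 for d in distances if d < threshold)
    return {
        "min_distance": min(distances),
        "share_too_close": too_close / len(synthetic_rows),
    }


# Hypothetical (spend, conversions) records, for illustration only.
real_rows = [(10.0, 30.0), (20.0, 62.0), (30.0, 88.0)]
synthetic_rows = [(10.0, 30.0), (15.0, 45.0), (28.0, 85.0), (40.0, 120.0)]

report = privacy_report(synthetic_rows, real_rows, threshold=1.0)
print(report)  # the first synthetic row copies a real row exactly, so it is flagged
```

A synthetic row that duplicates a real one maximizes fidelity but offers no privacy, which is why the three criteria have to be weighed against each other for each intended use.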

In the final article of this series, we will address the key question that arises from this discussion: Should your brand build its own UMM model?