To better understand and serve customers, organizations must tap more data, and more types of data, from more sources. They must also gather, analyze and act on that data far more quickly than before. However, traditional data migration and batch ETL (extract, transform and load) technologies often fail to deliver customer insights with the required levels of data integration, data quality and cost-effectiveness.

These tools often cause excessive latency, require expensive replication infrastructure and specialized staff, and delay the onboarding of new data or data types. Often custom-built, they also take too long to implement, provide little visibility into performance or cost, and lack adequate security and governance.

Seeing through the Data

Change data capture (CDC) can reduce or eliminate many of these challenges by using log-based replication to move only the data that has changed since the last replication. Using Google’s CDC capabilities and our data management accelerators, we helped a leading toy manufacturer achieve double-digit cost reductions and efficiency increases in its reporting and inventory management functions.
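To make the idea of log-based replication concrete, here is a minimal sketch in Python. It is an illustration only, not how Datastream or any other Google service is implemented. The names ChangeEvent, apply_changes and checkpoint are hypothetical, and the in-memory list stands in for a database's change log. The point it shows is that each run applies only the events recorded after the last saved position, rather than reloading the full table.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One entry in a (hypothetical) source change log."""
    position: int               # monotonically increasing offset in the log
    op: str                     # "insert", "update" or "delete"
    key: str                    # primary key of the affected row
    row: Optional[dict] = None  # new row contents; unused for deletes


def apply_changes(replica: Dict[str, dict],
                  log: List[ChangeEvent],
                  checkpoint: int) -> int:
    """Apply every event after `checkpoint` to `replica`; return the new checkpoint."""
    for event in log:
        if event.position <= checkpoint:
            continue  # already replicated in an earlier run
        if event.op in ("insert", "update"):
            replica[event.key] = dict(event.row or {})
        elif event.op == "delete":
            replica.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")
        checkpoint = event.position
    return checkpoint


# Assumed sample data: two initial inserts, replicated in a first run.
log = [
    ChangeEvent(1, "insert", "sku-1", {"units_sold": 10}),
    ChangeEvent(2, "insert", "sku-2", {"units_sold": 4}),
]
replica: Dict[str, dict] = {}
checkpoint = apply_changes(replica, log, checkpoint=0)

# Later changes at the source; the second run touches only events 3 and 4.
log.append(ChangeEvent(3, "update", "sku-1", {"units_sold": 12}))
log.append(ChangeEvent(4, "delete", "sku-2"))
checkpoint = apply_changes(replica, log, checkpoint)
print(replica, checkpoint)
```

Storing the checkpoint alongside the replica is what keeps each run incremental: a rerun with no new events changes nothing, which is the property that lets CDC avoid the full reloads typical of batch ETL.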

The manufacturer wanted to better understand how pricing, promotions and events such as movie releases affect the sales of associated products. Before engaging us, the manufacturer relied on manually creating and consolidating reports from sources such as Amazon Retail Analytics Premium (now called Amazon Brand Analytics) and its ERP and CRM systems.

We used the BigQuery data warehouse to deliver views and reports on everything from sales trends and customer product ratings to inventory and shipments. Creating time series analyses in BigQuery made the data more readily available for “slice-and-dice” analysis, allowing the manufacturer to provide hyper-personalized offers to customers, reduce reporting effort by 45% and lower inventory management effort by 30%.

The Tools at Work

One key challenge was creating real-time links from multiple retailers’ sales data into BigQuery. To do so, we used Datastream, a cloud-native, serverless and easy-to-use CDC and replication solution, together with:

  • Replication, a Cloud Data Fusion application that empowers enterprise ETL developers and data analysts to quickly load operational data into BigQuery
  • Dataflow CDC Templates that allow teams with deep data engineering expertise to customize CDC streams

Our accelerators work with Google’s CDC solutions to reduce the total cost of data by 40%, speed time to insight by 30% to 40% and achieve time to value in as little as three months, according to our benchmarks.

The need for data-driven customer insights will only grow, in both the near and long term. Add in the need for speed, and it’s clear that businesses need a new strategy for sharpening and accelerating their insights into customer needs and the best ways to meet them.

Read more about CDC and recent enhancements to Google’s CDC solution here. Attend the May 26 breakout session at the Google Data Cloud Summit.

Munira Gandhi

Munira Gandhi is Google Cloud Practice Leader at Cognizant, focused on presales and GCP community growth. She is a Google Cloud certified...