How Data Quality Drives Digital Transformation – Part I of II

Written by John Kosturos on September 19, 2020

First in a two-part series of articles on the role of data quality in Digital Transformation

It would be an understatement to say we’ve been hearing a great deal about Digital Transformation (DX) in the last few years. Hype abounds, but DX still deserves attention. It is a relevant, viable concept: businesses can leverage digital technology to effect dramatic changes in the way they function. In fact, I agree with Thomas Siebel that most businesses in today’s market must adopt digital transformation or risk extinction.

I thought it would be worthwhile, therefore, to devote some attention to DX and what it really involves. In particular, I want to highlight the critical role of data quality in making DX an operational reality.

What is Digital Transformation, really?

First, let’s talk about what DX isn’t. Getting a new piece of software like CRM that incrementally improves your business is not DX. It may be a great move, but you’re not transforming your company. That’s just the regular evolution of business practices. DX is different. It involves using technology to accomplish something revolutionary.

The DX process ranges from simple to extreme. In a simple DX scenario, a company employs software and hardware, often Internet of Things (IoT) devices, to reimagine some core areas of its business. For example, it might transform customer relationships with omnichannel customer engagement. A more intense approach to DX might mean reinventing the business based on technology. Thus, a print newspaper becomes a digital news outlet.

Alternatively, the most serious digital transformations involve reinventing entire industries. For example, social media is the new, digitally native version of traditional media—and it’s taking over. Print news businesses are collapsing. Even TV news is struggling as audiences flee to social media platforms. Indeed, the mandate for DX often comes from existential risk.

Avoiding extinction

In his influential book, Digital Transformation: Survive and Thrive in an Era of Mass Extinction, Thomas Siebel uses the term “Punctuated Equilibrium” to describe the disruptive cycle of business technology. To Siebel, just as the dinosaurs were killed off by a rapid change in Earth’s atmosphere, so too are businesses destroyed by technological changes that disrupt the market equilibrium that once supported them.

For example, Kodak could not survive the shift from film to digital cameras. The yellow pages could not survive consumers switching to Google, and so forth. DX offers a life preserver. It offers companies a way to invent the change that will help them thrive and survive—instead of waiting for someone else to change the game and render them obsolete.

Punctuated equilibrium occurs when a confluence of technological factors brings forth a discovery of what customers really want, versus what they’ve been told to want. A combination of hardware, software and data releases the customer from the terms of a relationship previously dictated by the company. The legacy business loses control of the customer relationship as a result of technological disruption. As Kodak discovered, for example, consumers didn’t want prints of photos on Kodak paper, which was its main product. People wanted pictures to share with friends and family, and a smartphone was a superior, faster and cheaper way to enjoy this experience. Or, as yellow pages publishers learned, consumers wanted to find phone numbers instantly, not take delivery of a bulky, costly book on their front doorsteps.

In the yellow pages case, the extinction only occurred once the technological elements of disruption were in place: pervasive PCs, universal Internet connectivity and comprehensive business phone number listings online. All three had to exist before the yellow pages business collapsed. This took a while. I can recall speaking with yellow pages publishers in the 1990s, telling them, basically, that their business was about to be demolished. There wasn’t a lot they could do about it, honestly, but if you’re in business today, it’s wise to scan the horizon to spot potential asteroids hurtling toward you.

Data and DX

Siebel posits that DX occurs along four dimensions, which he calls “Pillars.” They are Big Data, data analytics, Artificial Intelligence (AI) and IoT. For each pillar, data is critical to achieving the desired goals of transformation. To be precise, what’s really needed is what is known as “normalized data.” Normalized data is data that is uniform in format and field-level content, no matter the source. For instance, if one data set uses “AZ” to indicate Arizona and another uses “Ariz.”, that mismatch renders much of data analytics and data management essentially worthless. Normalization sets a standard so that both data sets say “AZ,” if that is the chosen convention.
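As a minimal sketch of what field-level normalization involves, consider mapping the many variants of a state name to one standard code. The mapping table and function names below are illustrative, not drawn from any particular product:

```python
# Map common variants of a state name to one normalized code.
# The variant list here is illustrative, not exhaustive.
STATE_VARIANTS = {
    "az": "AZ", "ariz.": "AZ", "arizona": "AZ",
    "ca": "CA", "calif.": "CA", "california": "CA",
}

def normalize_state(value: str) -> str:
    """Return the standard two-letter code, or the input unchanged if unknown."""
    return STATE_VARIANTS.get(value.strip().lower(), value.strip())

records = [{"city": "Phoenix", "state": "Ariz."},
           {"city": "Tucson", "state": "az"}]
for r in records:
    r["state"] = normalize_state(r["state"])
# Both records now carry "AZ", so they can be grouped and matched reliably.
```

Real data quality tools handle far more than state codes, of course, but the principle is the same: pick one standard representation per field and convert every source to it.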

To illustrate what I mean, I’ll use the example of social media sites disrupting the traditional news media:

  • Big Data – The “Big” in Big Data refers as much to the variety of data sources and formats as it does to data volumes. Social media sites are great at mining highly heterogeneous (but normalized) data repositories and real time data streams to discover the stories that are most relevant to a given site user at a given time. This pillar of DX of course requires strong data management capabilities.
  • Data Analytics – A social media site is able to present the most relevant stories to each user based on large-scale data analytics workloads. To understand the nuanced needs of billions of users simultaneously—based on their clicks, demographics, friends and so forth—is a massive data analysis and data quality challenge. Running analytics on duplicative or misformatted (non-normalized) data will return sub-optimal results.
  • Artificial Intelligence (AI) – It takes a sophisticated AI algorithm to enable a social media site to “know” what stories people want to see, and with whom they might want to share them. The software has to “think” about what each person might want to see and share, in real time. In this context, data quality is of paramount importance. The AI algorithm will only work properly if it’s parsing high-quality data.
  • IoT – Discussions of IoT tend to focus mostly on the sensor technology, the “things,” so to speak. However, seen from another angle, IoT is inherently a matter of data management. IoT devices generate enormous volumes of data at the “edge” of the network. This data must be managed and stored if it is to have any impact on the DX it is helping to power. In the social media case, IoT emerges in the form of mobile device usage data, e.g., when and where people consume social media stories. This data flows into the algorithm, bolstering its effectiveness.
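To see why duplicative data skews analytics, here is a small sketch of deduplicating user records before analysis. It builds a normalized key per record so that format variants of the same person collapse into one entry; the record fields are hypothetical:

```python
# Illustrative only: deduplicate user records by a normalized key,
# so analytics do not double-count one person under format variants.
def normalized_key(record: dict) -> tuple:
    # Email with whitespace trimmed and case folded serves as the match key.
    return (record["email"].strip().lower(),)

def deduplicate(records: list) -> list:
    seen = set()
    unique = []
    for r in records:
        key = normalized_key(r)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

users = [{"email": "Pat@Example.com", "clicks": 12},
         {"email": "pat@example.com ", "clicks": 12}]
print(len(deduplicate(users)))  # 1, not 2
```

Without the normalized key, a per-user metric such as average clicks would count this reader twice, which is exactly the “sub-optimal result” that non-normalized data produces at scale.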

Can a newspaper website ever hope to compete with such a disruption? The answer is yes, but it’s not an easy proposition. The secret to success, however, resides in its data. A newspaper that operates online actually has very rich data on readers’ preferences. With the right AI and Big Data tools, it could probably outrun social media in the race to present relevant news to dedicated readers.

For this to work, though, the newspaper would have to have its data in order. This is all about data quality. If the newspaper’s data sits non-normalized in separate silos, with errors, obsolete fields and irregular schemas, the AI and Big Data tools won’t be of much help.
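What “getting data out of silos” means in practice is mapping each silo’s irregular schema into one canonical schema before any AI or analytics can run. The silo names and field mappings below are hypothetical, chosen only to illustrate the idea:

```python
# Hypothetical sketch: two silos store reader data under different schemas.
# Mapping both into one canonical schema is a precondition for analytics.
FIELD_MAPS = {
    "print_subscribers": {"sub_name": "name", "zipcode": "postal_code"},
    "web_readers":       {"full_name": "name", "zip": "postal_code"},
}

def to_canonical(record: dict, silo: str) -> dict:
    """Rename silo-specific fields to canonical names, dropping empty values."""
    mapping = FIELD_MAPS[silo]
    return {mapping.get(k, k): v for k, v in record.items() if v not in ("", None)}

merged = [to_canonical({"sub_name": "Lee", "zipcode": "85001"}, "print_subscribers"),
          to_canonical({"full_name": "Sam", "zip": "10001"}, "web_readers")]
# Every record now shares the fields "name" and "postal_code",
# so downstream tools can treat the silos as one data set.
```

Production pipelines also have to handle errors and obsolete fields, but schema mapping of this kind is the first step toward the normalized state DX requires.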

How can a company get its data into a state of quality where it will facilitate DX? Stay tuned for the second part in this series: Data Orchestration and DX.