In his book, Data Quality: The Accuracy Dimension, Jack Olson explained that CRM data accuracy refers to whether data values are correct. To be correct, Olson argued, a data value must both be the right value and be represented in an unambiguous form, which is why he named form and content as the two characteristics of data accuracy.
“Form is important because it eliminates ambiguities about the content,” Olson explained. Form dictates how a data value is represented, and Olson used his birth date (December 13, 1941) as an example of how you can't always tell the representation from the value. If a database was expecting birth dates in a United States representation, a value of 12/13/1941 would be correct, 12/14/1941 would be inaccurate because it’s the wrong value, and 13/12/1941 would be inaccurate because it’s the European representation.
In the case of February 5, 1944, the United States representation is 02/05/1944, whereas the European representation is 05/02/1944, which could be misunderstood as May 2, 1944. Because of this ambiguity, a user would not know whether a birth date was invalid or just erroneously represented. “A value is not accurate,” Olson explained, “if the user cannot tell what it is.”
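The form problem described above can be illustrated with a short Python sketch. It is only an illustration, not something from Olson's book: the function names and the choice of `MM/DD/YYYY` as the expected representation are assumptions made for the example.

```python
from datetime import datetime

US_FORMAT = "%m/%d/%Y"        # month/day/year, the representation the database expects
EUROPEAN_FORMAT = "%d/%m/%Y"  # day/month/year


def parse_birth_date(text, fmt=US_FORMAT):
    """Return a date if the text fits the given representation, otherwise None."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None


def is_ambiguous(text):
    """True when the text is a legal date in both representations but names different days."""
    us = parse_birth_date(text, US_FORMAT)
    european = parse_birth_date(text, EUROPEAN_FORMAT)
    return us is not None and european is not None and us != european


print(parse_birth_date("12/13/1941"))  # fits the US form
print(parse_birth_date("13/12/1941"))  # None: there is no month 13
print(is_ambiguous("05/02/1944"))      # True: February 5 or May 2?
```

A check like this can reject 13/12/1941 outright, but it cannot tell which day 05/02/1944 is meant to be. It also cannot detect that 12/14/1941 is the wrong value, since that string is well formed.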
As for content, Olson explained that “two data values can be both correct and unambiguous yet still cause problems.” This is a common challenge with free-form text, such as a city name. “The data values St Louis and Saint Louis may both refer to the same city, but the recordings are inconsistent, and thus at least one of them is inaccurate.” Consistency is a part of accuracy, according to Olson, because “inconsistent values cannot be accurately aggregated and compared. Since much of data usage involves comparisons and aggregations, inconsistencies create an opportunity for the inaccurate usage of data.”
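To see why inconsistent recordings distort aggregation, here is a minimal Python sketch. The abbreviation table and the function names are hypothetical and kept deliberately tiny; real standardization rules would need to be far more complete.

```python
from collections import Counter

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"st": "saint", "ft": "fort", "mt": "mount"}


def standardize_city(name):
    """Expand known abbreviations and apply one consistent capitalization."""
    words = name.replace(".", "").lower().split()
    expanded = [ABBREVIATIONS.get(word, word) for word in words]
    return " ".join(word.capitalize() for word in expanded)


def count_by_city(cities):
    """Aggregate records by their standardized city name."""
    return Counter(standardize_city(city) for city in cities)


records = ["St Louis", "Saint Louis", "St. Louis", "Chicago"]
print(Counter(records))        # three separate spellings of the same city
print(count_by_city(records))  # one entry for Saint Louis with a count of 3
```

Without the standardization step, a count by city would report three different cities, each with one record.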
“The definition of a value being valid,” Olson explained, “means that the value is in the collection of possible accurate values, and is represented in an unambiguous and consistent way. It means that the value has the potential to be accurate. It does not mean that it is accurate. To be accurate, it must also be the correct value.”
“Defining all values that are valid for a data element is useful because it allows invalid values to be easily spotted and rejected from the database. However, we often mistakenly think values are accurate because they are valid. For example, if a data element is used to store the color of a person’s eyes, a value of Truck is invalid. A value of Brown for my eye color would be valid but inaccurate, in that my real eye color is blue.”
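The eye color example can be sketched in Python to show the difference between the two checks. The list of valid colors and the function names are assumptions made for this illustration.

```python
# Assumed list of valid eye colors, for illustration only.
VALID_EYE_COLORS = {"amber", "blue", "brown", "gray", "green", "hazel"}


def is_valid_eye_color(value):
    """Valid: the value belongs to the collection of possible accurate values."""
    return value.strip().lower() in VALID_EYE_COLORS


def is_accurate_eye_color(value, true_value):
    """Accurate: the value is valid and also matches the real-world fact."""
    return is_valid_eye_color(value) and value.strip().lower() == true_value.strip().lower()


print(is_valid_eye_color("Truck"))              # False: invalid, easy to reject
print(is_valid_eye_color("Brown"))              # True: valid
print(is_accurate_eye_color("Brown", "blue"))   # False: valid but inaccurate
```

The validity check needs only the list of allowed values. The accuracy check needs the true value, which a database usually does not have, and that is why valid values are so easily mistaken for accurate ones.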
“The short answer is no,” Olson explained. “There will always be some amount of data in any database that is inaccurate. There may be no data that is invalid. However, as we have seen, being valid is not the same thing as being accurate.” Olson noted it’s rare that an application would demand 100% accurate data to satisfy its business requirements, which is why he said, “The long answer is yes. You can get accurate data to a degree that makes it highly useful for all intended requirements.”
This post originally appeared in the OCDQ Blog.
Jim Harris is a data quality expert, a freelance writer, professional speaker, thought leader, and independent consultant. He runs the Obsessive-Compulsive Data Quality (OCDQ) Blog offering a vendor-neutral perspective on data quality and its related disciplines.
CRM data deduplication is like an equation waiting to be solved. Just like high school math, the more variables you know the values for, the easier the problem is to solve.
CRMs include a vast number of data markers. These markers are a literal road map to filling in missing data. For example, if 100% of the emails for a particular company have the email format of Firstname.Lastname@domain.com, then you can fill in missing emails for other contacts with confidence. If you have the email domains for contacts, but the account record is lacking a website, that can be filled in too.
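As a rough illustration of using an email format as a data marker, the Python sketch below fills in missing emails only when every known email at the company follows the Firstname.Lastname pattern on a single domain. The record layout (dictionaries with `first`, `last`, and `email` keys) and the function names are assumptions, not the behavior of any particular CRM.

```python
def local_part(contact):
    """Build the firstname.lastname portion of an address for one contact."""
    return f"{contact['first'].lower()}.{contact['last'].lower()}"


def infer_domain_if_consistent(contacts):
    """Return the shared domain if 100% of known emails are firstname.lastname@domain."""
    domains = set()
    for contact in contacts:
        if not contact.get("email"):
            continue  # nothing to learn from a blank email
        local, _, domain = contact["email"].lower().partition("@")
        if local != local_part(contact):
            return None  # at least one email breaks the pattern
        domains.add(domain)
    return domains.pop() if len(domains) == 1 else None


def fill_missing_emails(contacts):
    """Return copies of the contacts with blank emails filled from the inferred pattern."""
    domain = infer_domain_if_consistent(contacts)
    if domain is None:
        return list(contacts)  # not confident enough to fill anything
    filled = []
    for contact in contacts:
        if not contact.get("email"):
            contact = {**contact, "email": f"{local_part(contact)}@{domain}"}
        filled.append(contact)
    return filled


company = [
    {"first": "Ann", "last": "Lee", "email": "ann.lee@example.com"},
    {"first": "Bob", "last": "Ray", "email": "Bob.Ray@example.com"},
    {"first": "Cy", "last": "Doe", "email": None},
]
print(fill_missing_emails(company)[2]["email"])  # cy.doe@example.com
```

The sketch refuses to fill anything if even one known email breaks the pattern, which mirrors the 100% condition in the example above. A looser threshold would fill more records at the cost of some wrong guesses.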
A company’s website address is a unique identifier, and it is more important than the company name. Look at PeopleSoft as an example. Long after the company was acquired by Oracle, you could still navigate to www.peoplesoft.com. A few years later the site was absorbed, but it outlived the company.
Done properly, leveraging data markers within a CRM allows you to pre-fill data for a more complete picture before deduping. In addition, the data should be standardized (sometimes called normalized) before the dedupe process is done.
If you take two measures before deduping, (1) data normalization and (2) data fill, your dedupe process will be greatly improved.
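A minimal Python sketch of normalizing before deduping follows. It uses the website address as the match key, and it assumes the fill step has already run, so records that still lack a website are simply kept. The record layout and function names are hypothetical.

```python
from urllib.parse import urlparse


def normalize_website(url):
    """Reduce a website address to a bare lowercase host such as 'example.com'."""
    url = url.strip().lower()
    if not url:
        return ""
    if "://" not in url:
        url = "http://" + url  # urlparse only fills netloc when a scheme is present
    host = urlparse(url).netloc
    return host[4:] if host.startswith("www.") else host


def dedupe_accounts(accounts):
    """Keep the first account seen for each normalized website."""
    unique, seen = [], set()
    for account in accounts:
        key = normalize_website(account.get("website") or "")
        if key and key in seen:
            continue  # same website already kept, so treat this record as a duplicate
        seen.add(key)
        unique.append(account)
    return unique


accounts = [
    {"name": "Example Inc", "website": "http://www.example.com/"},
    {"name": "Example, Inc.", "website": "HTTPS://Example.com/about"},
    {"name": "Example", "website": "www.example.com"},
    {"name": "Other Co", "website": "other.org"},
]
print(len(dedupe_accounts(accounts)))  # 2
```

Matching on the raw website strings would have found no duplicates here, because all three spellings differ. Matching on the company name alone would also have missed them.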
A Data Plan is the protection against duplicates and against bad data. Bad data comes in via many routes, including technology, the users entering the data, lists, etc. No matter which avenue the data comes in through, a lack of process and planning will result in bad data.
It’s important to note that the Data Plan goes beyond CRM. Marketing automation, CRM, SFA, ERP, accounting systems, etc. all benefit from a Data Plan. Have you ever not received payment because you sent a bill to the wrong division of a big company? That’s due to bad data.
The Data Plan encompasses a strategic, corrective, and preventative set of processes around what’s going to happen to data.
No matter what system, platform, or process you use, make data validation, data compilation, and data standardization a priority.
Donato Diorio is the Acting CEO of RingLead, with primary responsibility for combining the teams, technology, and vision of RingLead and Broadlook Technologies, a Data-Technology company founded by Donato. Donato started his career as a software engineer and quickly became a specialist in process automation. Software architect first, top billing recruiter second, Donato combined his two careers, technology and talent selection, to found Broadlook Technologies in 2002. A recognized thought leader and speaker on Data Quality and Recruitment, two diverse areas that are both technology-dependent and process-driven, Donato lives by the mantra “Build technology that is the right balance between automation and human interaction.” Executing this vision has delivered consistent innovation for Broadlook and now the RingLead-Broadlook combination.
In order to assess the quality of your cold emails, all you need to do is check your email response rates: if you’re sending to a targeted list and your response rate is less than 10%, then you should spend more time perfecting your cold email templates.