3x Marketo Champion Elliott Lowe shares his insights into Marketo data management at the New York Marketo User Group.
The Cost of Dirty Data on Demand Generation
“Dirty data” is an umbrella term for various data quality issues: duplicates, incomplete records, outdated information, inaccuracy, and non-standardized data.
Having even one of these issues is a sign that an organization’s demand generation strategy is not optimal. Generating leads becomes much easier with clean data.
Elliott Lowe cites a study that shows that organizations with strong data quality processes in place have a much greater quantity of usable records, ultimately resulting in 66% higher revenue generation than a company without similar processes.
How to Identify Your ICP/Personas
3x Marketo Champion Elliott Lowe teaches you the best practices for identifying and defining your ideal customer profile (ICP) and how to target personas.
By using a combination of static data (demographics, firmographics, social data) and non-static elements (intent data & engagement metrics) you can get a more accurate picture of your total addressable market.
The more data points your sales and marketing teams have to work with, the better their ability to segment the audience and personalize messaging based on these segments.
How to Identify Critical Data Points
3x Marketo Champion Elliott Lowe teaches you the best practices for identifying critical data points to help in your ABM strategy.
Categorizing your data fields by type and function, and assessing each one’s usefulness to your marketing efforts, is the first step to a more successful endeavor. All marketing processes, from nurturing to lead scoring, routing, and segmentation, need as many data points as possible to build more complete profiles of the target.
How to Assess Data Quality and Accuracy
3x Marketo Champion Elliott Lowe teaches you the best practices for auditing your Marketo database.
Once you’ve identified the data points most useful for your marketing efforts, analyzing these fields for completeness and accuracy is crucial to maximizing the effectiveness of your campaigns. Data migration, CRM syncs, and human error are the most common sources of substandard data in Marketo.
How to Analyze Fill Rates
3x Marketo Champion Elliott Lowe teaches you how to analyze fill rates in Marketo to assess your system for record completeness.
Though you may have ample records in your system, they are meaningless with empty fields. Assessing the fill rates of your records can help uncover gaps in your data collection processes and give you the resources to gather more complete profiles going forward.
Data Enrichment Options
3x Marketo Champion Elliott Lowe teaches you the options an organization has when deciding to enrich their Marketo records with information from outside data vendors and providers.
Data enrichment uses third-party data sources (such as D&B, Experian, Nielsen) to append supplemental information to existing and incoming records. This practice is common in organizations that seek to hyper-personalize campaigns and nurture streams.
Marketo User Group Q&A
3x Marketo Champion Elliott Lowe answers questions from the audience about Marketo data management.
- One of the things that I want to ask you about is how to identify unengaged records and come up with some classes to get them out of the way, so we can make way for new data?
- What does RingLead do as an organization when you approach a client where there’s a hoarding culture? They don’t want you to delete anything. So, how do you guys approach that?
Elliott Lowe: So, basically the presentation today is a great complement to the one that Alex just gave, because you’re developing all this wonderful content, but it’s like, “How do I know who this person is or this company is, and what are they interested in? What are their firmographics?” And I’m not just talking like, “Oh, they’re a billion-dollar company located in Seattle, Washington” or something like that. I’m talking super deep firmographics, like what are they doing on their website, what kind of technology are they using on that website, what social media are they engaged in, so you can interact with them in a more natural way than just sending them an email. So, we’re going to go through some of the steps to building that data platform, which you can use to have much more targeted and personalized communications.
Okay. So today, the first thing is just sort of a review, because I’ve given this slide in different forms in the past: SiriusDecisions had a great study about the cost of bad data on demand generation. So, we’ll check that out. We’ll talk a little bit about the first step to doing this, which really is to understand who you’re trying to target, then identifying the critical data points that you need to be able to functionally target that individual. Then once you’ve identified your data points, figure out how bad your data is: how much data is missing, how much data is in the wrong field, and I’ll show you some examples of that in a moment. And finally the fill rates, which come after the data quality analysis.
Once you’ve identified all the bad records, the records you don’t want, they might have all sorts of data. But if they’re bad records, you can’t email them. So you want to take them out and not use them in the calculation of your fill rates, so you don’t get a false sense of, “Oh, I’ve got lots of data” when you can’t use it for marketing to anybody. And finally we’ll talk a little bit about the options for enriching data once you’ve identified where you’ve got gaps and where you need more data.
So, a little bit briefly about what RingLead does. We were founded 15 years ago now, I think, in 2004. We’re involved in identifying and merging duplicate records, both in Marketo and Salesforce, and soon we’ll also be doing that for the Eloqua platform. We also do prevention of duplicate records in those platforms. We also facilitate routing and matching leads to accounts, and enriching the data, which I’ll go into in more depth in a moment. One of the things we do is aggregate data from like 12 different vendors. So, it’s not like you’re going to a single vendor like D&B and you’re only getting their data and no other data. We have the ability to aggregate it, and from that we’re able to not only give you unique data that you might not be able to afford from a single vendor, but also a compendium of data where we make sure that we’re giving you the data from the best source that [inaudible] needs. And finally, we actually do provide consulting services to pretty large enterprises indirectly, so a lot of times when they go into a data quality project, it’s overwhelming and they look to us to help them with part of that.
The Cost of Dirty Data on Demand Generation
So, this is the slide I promised about dirty data. The idea with the SiriusDecisions study was they took a hypothetical 100,000-record database and analyzed it for different levels of data quality as well as the performance metrics. What they found, as you can see, is that companies with very strong data quality processes in place have a much higher percentage of usable records, and that cascades down through the waterfall and ultimately results in 66% higher revenue generation from any given inquiry than an average data quality company.
So, definitely, it’s important to do this. And why the big disparity? What contributes to that? The disparity is based on a lot of things, and one of them can be poor lead scoring. If you don’t have good data in your database, you’re not going to get good firmographic scores, and if you have duplicate data in your database, you’re going to dilute scores. You can have bad email addresses, for example, that you think are good and you’re trying to send emails to, or bad phone numbers, and people are wasting time trying to call those people, and it gives you an inflated view of the reach of the database.
So, you definitely want to analyze that and understand it so you can address it and correct it. If you don’t have correct data, then records are potentially going to get routed to the wrong salespeople or, in the case of a duplicate, you potentially have two different salespeople trying to call on the same person. And God forbid the person they think they’re calling on as a prospect is actually a customer, and you don’t have a way to relate your lead records to your account records, which is something that can be done. Again, all of this wastes salespeople’s time, and it lowers the customer’s confidence in your ability to deliver on your promise to them with your product.
And I guess the last thing I’ll mention that really impacts this, again back to personalization, is the inability to really finely personalize your communications to these individuals. It can impact your ability to develop ideal customer profiles and implement them in an ABM-type project. In fact, I’m curious. There’s definitely a continuum of ABM in various companies. At one point I worked with Engage and we developed an ABM maturity model, which you can go to their site and take, but I’m curious, just a show of hands. Who in the room feels that they’re very advanced in their account-based marketing right now? Not too many people. So, that’s always the end of the rainbow, right? And I’m not saying what I’m going to talk about today is going to get you there, but it’s little baby steps. It’s a journey of a thousand miles, and these are just the steps to get there.
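To illustrate the point about duplicates diluting lead scores, here is a minimal sketch. The threshold, point values, and field names are illustrative assumptions, not Marketo's scoring model.

```python
# Hypothetical qualification threshold and engagement events.
MQL_THRESHOLD = 50

events = [
    {"record_id": "A", "email": "pat@example.com", "points": 30},  # webinar attended
    {"record_id": "B", "email": "pat@example.com", "points": 25},  # form fill that created a duplicate
]

def score_by(events, key):
    """Sum points per distinct value of `key` (record ID or email address)."""
    totals = {}
    for event in events:
        totals[event[key]] = totals.get(event[key], 0) + event["points"]
    return totals

per_record = score_by(events, "record_id")  # duplicates scored separately
per_person = score_by(events, "email")      # scored as one person after merging

qualified_before = [r for r, s in per_record.items() if s >= MQL_THRESHOLD]
qualified_after = [p for p, s in per_person.items() if s >= MQL_THRESHOLD]
```

With the duplicates left in place, neither record reaches the threshold on its own; scored as one person, the same activity qualifies.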
Now, who is using like a hybrid-type ABM model where content’s not always particularly personalized down to an individual level but at least you’re doing things like you understand in an industry or you understand the scale or the size of a particular company? How many people have that in their? And then I assume the rest of you are still struggling even to get to that point, which again these steps will help move you along that journey to getting where you can do that type of stuff.
How to Identify Your ICP/Personas
So as I mentioned before, you have to have a destination on your journey, and that destination is identifying what your ideal customer profile is. Now, there are a lot of different things you would look at to do that. Anecdotal information from your sales team would be a great first start, but ultimately you want to move beyond that and start looking at correlations between data elements, both static data elements having to do with firmographics and data elements having to do with engagement and intent. Intent data is incredibly important in understanding where somebody is in their buyer’s journey, so you can create your messaging and communicate very crisply to them about where they are and what their needs are. Oh, and by the way, at the end of my slide presentation, which we’ll provide to everybody, there’s a link to many resources, and one of those is a worksheet from Terminus on developing an ICP. It’s actually a pretty good tool.
How to Identify Critical Data Points
So, unfortunately, I had this set up for animation, where these would build so you wouldn’t have to digest all of them at once, but we’ll go through them one by one. The idea here is you want to take an inventory of all of the critical data points. Part of that is: I’ve identified who my ideal customers are, and I need to know the following pieces of information in order to identify those people. So, you definitely want to identify those data points, but beyond that there are also data points that you’re going to need for personalization as you engage with them. Somebody that’s in a food retailing business, for example, might have very different data points than an industrial manufacturer. So, that’s an example of the kind of disparity in data points, and we provide a lot of that through our tool, which we’ll get into in a moment.
So basically, the most common use cases you’re going to run into in this scenario are things like lead scoring, where you need to know industry, revenue, employees, things like that. Routing, where your territories might be divided up by geography or by size of company, so you need that type of data. When you want to segment your database for targeting with communication messages, you want to know what that person’s standardized job title is, for example, and again the industry and possibly even things like their education background and prior jobs, which we’ll get into in a moment. Then when you want to do that personalization, you’re going to need things like the company name, first name, last name, title and so forth. And finally, for prospecting, you need to know how to contact that individual.
So, these are some of the major data categories, both for persons and for companies. And this is just the tip of the iceberg, actually. There are hundreds of data fields, which I’m going to very briefly go through and maybe talk about a couple, but they’re in the deck, so when you get it you’ll get a sense. I would encourage you to go through and think about each of these individual data elements and how you might be able to leverage it to improve the personalization of your communications.
So, the company and person basically have somewhat similar information in the first few categories, but then for a person you can actually get information about their education, their interests, and past jobs they’ve had. At the company level you can get an idea about their social media presence and, as with an individual, their digital presence, their industry classifications, not just NAICS but SIC and multiple values for those, and then their financial situation.
So, these are some examples of the basic data you would want for an individual, and you can see one of the things that you can even get, potentially, is a link to their bio. If they have a bio out there, that’s a wealth of information for a sales rep to read through before reaching out to an individual, so they can much better understand that individual’s background. Also what language they speak and potentially whether they have their own website, how old they are, their gender, things of that nature. Then their contact information, including fax, still. Not too many people use that, I don’t think.
And email verification, which again is pretty important. Maybe somebody had a typo when they were filling in a form on your website, or maybe they’re not a good person and they’re putting a spam trap into your database, or bots are doing that somehow. The email verification can run automatically before you ever send an email, so you can ensure that you’re not sending undeliverable emails, which are going to impact your sender reputation. And worse, if it’s a spam trap and you send to one of those, you’re going to be hosed for quite a while as you struggle to get yourself off blacklists. Oh, also, if an individual has more than a single email, say a personal email and a work email, all the emails that they’ve ever had or used in the past can be found. So, if you’re not having success with them at one email, you can try one of the other emails that’s available.
Locational information, including the time zone, their home, their work. Social media, where they hang out. If they have addresses on these places, it’s a great place, again, to follow them, and you can get ideas about intent if you monitor places where they tend to post a lot, and you can also engage with them. Their educational level and where they went to school, which is always great for communications if you can reference things like that. Occupational background: where they are now, the department they’re in, previous positions they’ve held at other companies and how long they were there. And finally personal interests. This is really golden, again, if you’re trying to customize. You could have a campaign dedicated to people that are focused on sports, and you can create a whole sports campaign around that, targeting various individuals that have proven to have that interest.
So, the company data is pretty standard-type information here, including parent companies and whether they’re public or not. Then some of the contact data for the company itself: toll-free numbers, the main company email, and the email pattern, which can be helpful. If you identify an individual but for whatever reason you can’t get that email address, or if you entered the person into your database and ran an enrichment tool and it doesn’t come back with one, sometimes the email pattern allows you to guess, and you can actually succeed in connecting with that individual through email. I wouldn’t advise doing that frequently, but it’s there as an emergency stopgap if you need it.
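As a rough illustration of guessing an address from a company email pattern, the sketch below assumes the pattern arrives as a template string. The placeholder names (`{first}`, `{last}`, `{f}`, `{l}`) and the function are hypothetical and do not reflect any vendor's actual format.

```python
def guess_email(first_name, last_name, pattern, domain):
    """Build a candidate address from a company email pattern.

    `pattern` uses hypothetical placeholders: {first}, {last}, {f}, {l}.
    The result is only a guess and should be verified before sending.
    """
    first = first_name.strip().lower()
    last = last_name.strip().lower()
    local_part = pattern.format(first=first, last=last, f=first[:1], l=last[:1])
    return f"{local_part}@{domain.lower()}"

candidate = guess_email("Ada", "Lovelace", "{f}{last}", "example.com")
```

Given the caution in the talk, a guessed address like this would be a last resort, and running it through email verification first seems prudent.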
Again, company geolocation, including things like how many PCs they have at that location, for example, which informs you to some extent how many technology-oriented workers are at that company, at the headquarters. Their social media presence: again, a great place to hang out, follow them, and get notifications about things they’re interested in and doing, to drive personalized content. And their digital presence, their ISP and website, and company industry classification, including DUNS numbers, CUSIP, EIN, and things of that nature. Finally, company financials. And again, in addition to all of these fields and data categories I’ve talked about, definitely consider the intent and engagement data. That’s just super important.
So as I mentioned before, SiriusDecisions did that study about the impact of bad data. Part of that study determined that up to 25% of a given company’s data is garbage, which is pretty amazing. They also said, I don’t know if it was that study or another one, that about 3% of the contacts in your database degrade every month. So in the course of a year, that’s about a third of your database that has degraded. And you can’t really, and you probably don’t want to, impose on individuals to keep coming back and giving you data over and over and over again, data you don’t have.
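As a quick check on that arithmetic, the sketch below compares a straight-line estimate with a compounding one. Treating the 3% as applying each month to the records that are still good is an illustrative assumption, not something the study is quoted as stating.

```python
def decayed_fraction(monthly_rate, months):
    """Fraction of records gone stale after `months`, compounding monthly."""
    return 1 - (1 - monthly_rate) ** months

simple = 0.03 * 12                        # straight-line estimate: 36%
compounded = decayed_fraction(0.03, 12)   # compounding estimate: roughly 31%
```

Either way the result lands near the "about a third per year" figure quoted in the talk.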
So this is where there’s some options, and we’ll talk about those in a moment. So, before you try to figure out again like, “Well, do I have data? Am I missing data? What do I need to do? Should I enrich the data?” You don’t want to start doing things like merging records together, enriching records that are actually bad records, the person’s no longer at the company, things of that nature.
How to Assess Data Quality and Accuracy
So, you want to do an audit of the data. One of the things to do as part of that audit is look at your data sources. You want to understand where the data is coming from, and if you find troves of data that are really sketchy and bad, with missing information, where most of the people have unsubscribed or their email addresses are bad, you want to make sure you’ve identified what those sources were so you can either stop using those sources in the future or at least add additional fields on a form. Or better yet, perhaps have an intelligent form that can, in the background, add data into hidden fields once that individual gives you their email address and their company name.
You also want to check your CRM sync. I see a lot of problems with data where the sync is not occurring properly between Marketo and Salesforce, or potentially Dynamics; my background is mostly Salesforce. You can wind up in situations where salespeople are entering data on their side but it’s not getting into Marketo, or vice versa, because there’s a validation rule in place that’s preventing the sync from occurring. Or you’ve got a thing that happened to us recently. It happened over a year ago, but it happened again. Someone had installed quite a few apps in our Salesforce org over the last year, and one of them had just bajillions of Apex triggers. And there’s a limit on the CPU. I can’t remember what it’s called, Apex CPU limit or something like that. But when you try to sync the record, if it triggers Apex, Salesforce will actually come back and say, “I can’t do this because you’re over your CPU limit on Apex”. So, if you’ve got garbage things like that, you need to uninstall them so that you can improve the sync performance and get your data back and forth between the systems.
And then as I mentioned, you don’t want to go out then and use records that really aren’t engaging with you, that are undeliverable and use them as part of your analysis of, “Do I have a good fill rate on my data? Oh, I don’t? Oh, okay. I should go enrich that”. You don’t want to enrich bad data records, so you definitely want to purge those out of your system and then after you’ve done that, if you’ve got duplicate records you want to merge those together because again, you don’t want to be enriching duplicate records and wasting a bunch of money doing that.
How to Analyze Fill Rates
So, I mentioned the fill rates. How do you analyze the fill rates? If you only have a handful of fields, if you’re at the early stages of trying to get personalization data that you can use for your communications or to create segmentations for ABM and so forth, you could potentially just analyze the fill rates manually using, say, Marketo or CRM reports. Or if you have lots of fields and lots of records, you may decide you actually want to use a dedicated app to help you analyze that.
So, here’s an example of a report I created in Marketo. It’s not even a real-world example because I limited it to just a few countries, but you can see there’s your empty count, and then you’ve got the totals. So, you do the math and you come up with your fill rate on that particular field. But you don’t want to have to be doing that for dozens of fields on some periodic schedule.
So instead, RingLead actually has an app. It’s a free app called Field Trip that you can have your Salesforce admin install. You can define those critical data points, those fields that you want to know about, and it’ll calculate whether they have data in them or not. It’s a bit of a misnomer that it says ‘data quality’. It’s really ‘data fill rate’ that it’s looking at. It’s not trying to ascertain whether you’ve got somebody’s first name in a country field or something like that. And for that, actually, I’m not aware of too many apps that can do that type of stuff, but you will want to do some sampling of your data to look for it, especially when you do list imports. Oh, my gosh, that is the worst, and I was talking to Alex earlier about this. You’ve got salespeople doing list imports, and they don’t pay attention to columns. They just bring the data in.
I had somebody that actually mapped the email field to a separate email field, so the records came in without an email address, and of course that wound up causing a lot of duplicates, and of course you couldn’t email those people. I found the email and was able to move it into the right field, but you spend hours and hours and hours doing that, so it’s really important to set up permissions so that people, salespeople and potentially other marketers, can’t go rogue on you and cause issues with your data like that, which you then have to go clean up. This slide actually is a little more representative of the detail you can see. You can see the critical fields and the actual fill rates, and you can even set weightings, and that’s what controls that scale, that gauge, as to whether it’s way over in the green or way back in the red: the weightings on the fields you define that you want to manage.
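The math described above (filled records divided by total records) can be sketched for several fields at once. This is a minimal example over an exported list of records; the field names and sample data are made up for illustration, and it is not how Marketo reports or any particular app compute the figure.

```python
def is_filled(value):
    """Treat None, empty strings, and whitespace-only strings as empty."""
    return value is not None and str(value).strip() != ""

def fill_rate(records, field):
    """Share of records whose `field` holds a non-blank value."""
    if not records:
        return 0.0
    filled = sum(1 for record in records if is_filled(record.get(field)))
    return filled / len(records)

# Hypothetical export; in practice this would come from a report or CSV.
records = [
    {"email": "a@example.com", "country": "US", "industry": "Retail"},
    {"email": "b@example.com", "country": "",   "industry": None},
    {"email": "c@example.com", "country": "DE", "industry": "  "},
    {"email": "d@example.com", "country": "US", "industry": "Software"},
]

report = {field: fill_rate(records, field) for field in ("country", "industry")}
```

Following the advice in the talk, the `records` list would ideally already exclude undeliverable and unengaged records, so the rates are not inflated by data that cannot be used.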
I hope this works. Oh, it did. The slide did work here. Okay. So, as we were talking about if there’s just a few fields that you’re interested in, you don’t have that many records, potentially you could get an intern or a coordinator or something in your marketing department to go out and do the research. Go to LinkedIn and fill in the missing data for these people. But again, if they do that you’re going to have data entry errors, you’re going to have people overlooking fields and things like that. It’s cheap, relatively speaking, depending on how expensive your resource is, but it’s not really a scalable option if you’ve got more than a handful of fields or just missing a few records, missing a few fields.
Data Enrichment Options
So, the other option is to use an offline data provider, and again, this means identifying a data provider you think can provide you the most coverage for what you need. But the issue there is you have to export your data and send it off. Days later, maybe even longer, you get the data back. Now you have to load it in, and if you were to try to load it based on email address and you have duplicates in your database, then that data may not map to the correct record, and definitely not to all of the duplicates. It’ll only map to one of the [inaudible] records based on email address. So, with offline data providers, it can be a big batch update, and it can be tempting to do that, but again keep in mind the latency between sending it off, getting it back, reviewing the data and so forth.
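One way to at least surface the duplicate problem when loading a vendor file back in is to match on email yourself and flag addresses that hit more than one record. The sketch below is an illustration under assumed field names; it is not how Marketo's list import behaves.

```python
from collections import defaultdict

def apply_enrichment(records, vendor_rows):
    """Copy vendor fields onto every record sharing the row's email.

    Only blank fields are filled. Returns the emails that matched more
    than one record so they can be reviewed or merged.
    """
    by_email = defaultdict(list)
    for record in records:
        by_email[record["email"].strip().lower()].append(record)

    duplicated = []
    for row in vendor_rows:
        email = row["email"].strip().lower()
        matches = by_email.get(email, [])
        if len(matches) > 1:
            duplicated.append(email)
        for record in matches:
            for field, value in row.items():
                if field != "email" and not record.get(field):
                    record[field] = value
    return duplicated

# Hypothetical database records and vendor file rows.
records = [
    {"id": 1, "email": "Pat@example.com", "industry": ""},
    {"id": 2, "email": "pat@example.com", "industry": ""},
    {"id": 3, "email": "lee@example.com", "industry": "Retail"},
]
vendor_rows = [
    {"email": "pat@example.com", "industry": "Software"},
    {"email": "lee@example.com", "industry": "Grocery"},
]

needs_merge = apply_enrichment(records, vendor_rows)
```

Merging duplicates before sending the file out, as recommended elsewhere in the talk, avoids the issue altogether; this only makes it visible after the fact.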
And then the last one would be an online data provider. In this scenario, you can do the enrichment in real time on specific sets of data. So, you don’t necessarily have to enrich your whole database at one time, budget allowing. You can just create segments. Maybe I’m going to be doing a campaign targeting a certain segment of my database, and I want to make sure I’ve got the freshest, most complete data. So, you would just run it on that segment this month, and then next month go and do another segment.
Another option that you have, if you want to maintain the quality of your data rather than doing it in batches, is to have the enrichment run on a trigger basis when a new record gets created, so that you’re able to populate all of the appropriate fields. Then you can route that record to the right person, you can put them in the right segment, and you can communicate with them in a personalized fashion.
So, that’s pretty much it. Oh, the one thing I would add: if you enrich that data, make sure that after you get the enriched data you normalize it, because you don’t want a mixture of, say, two-letter ISO codes for country and full country names, as an example, or some names all uppercase and some lowercase. That’s going to look like hell on the emails that you send out. So, definitely, if you do use enrichment, make sure you’re running normalization behind that. That can be normalization you’ve set up in Marketo, although Marketo is relatively limited to string normalization, things like that, so you would either have to do it manually by exporting to Excel, blah-blah-blah, or you can use one of the data enrichment providers, and that is something we do.
So, these are links to that Field Trip tool, which is free. Also Dupe Dive, which can identify how many duplicates you have in your database, and not just on exact email like Marketo, but you can actually specify fuzzy matching criteria as well, so things like a first name, last name, and company name match, because you can definitely get different formats for an email address, or maybe you have a personal email and a corporate email but it’s the same person. This one is actually a great tool for the steps you can take to ensure that you’ve gotten most of those bad, undeliverable, unengaged records out before you start analyzing your fill rates and then enriching your data. And then there are some of the steps in the presentation I just gave, and the worksheet from Terminus.
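For the manual route, the kind of normalization described above might look like the following minimal sketch. The country lookup is deliberately tiny and illustrative; a real one would cover the full ISO 3166-1 list, and the name handling is a simple heuristic rather than a complete solution.

```python
# Tiny illustrative lookup; a real table would cover every country.
COUNTRY_TO_ISO2 = {
    "united states": "US",
    "usa": "US",
    "germany": "DE",
    "united kingdom": "GB",
}

def normalize_country(value):
    """Return a two-letter code when recognized, else the trimmed input."""
    cleaned = value.strip()
    if len(cleaned) == 2 and cleaned.isalpha():
        return cleaned.upper()  # assume it is already a code; only fix casing
    return COUNTRY_TO_ISO2.get(cleaned.lower(), cleaned)

def normalize_name(value):
    """Fix all-uppercase or all-lowercase names; leave mixed case alone."""
    cleaned = " ".join(value.split())
    if cleaned.isupper() or cleaned.islower():
        return cleaned.title()
    return cleaned
```

Leaving mixed-case names untouched is a deliberate choice here, so that a value like "McDonald" is not flattened to "Mcdonald".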
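The idea of fuzzy matching on first name, last name, and company, rather than exact email, can be sketched with Python's standard library. This is a generic illustration, not the algorithm Dupe Dive uses; the 0.9 threshold and field names are assumptions.

```python
import re
from difflib import SequenceMatcher

def match_key(record):
    """Lowercased name plus company, with punctuation stripped."""
    raw = f"{record['first_name']} {record['last_name']} {record['company']}"
    return re.sub(r"[^a-z0-9 ]", "", raw.lower()).strip()

def is_likely_duplicate(a, b, threshold=0.9):
    """True when the two records' keys are at least `threshold` similar."""
    return SequenceMatcher(None, match_key(a), match_key(b)).ratio() >= threshold

# Same person with a work email and a personal email, plus an unrelated record.
work = {"first_name": "Jon", "last_name": "Smith", "company": "Acme Inc.",
        "email": "jsmith@acme.example"}
personal = {"first_name": "John", "last_name": "Smith", "company": "Acme Inc",
            "email": "john.smith@mail.example"}
other = {"first_name": "Jane", "last_name": "Doe", "company": "Globex",
         "email": "jane@globex.example"}
```

An exact-email comparison would treat `work` and `personal` as different people, while the fuzzy key flags them as a likely duplicate for review.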
Marketo User Group Q&A
Q1: Just want to say, great presentation. You touched on a lot of challenges that we face.
One of the things that I want to ask you about is how to identify unengaged records and come up with some classes to get them out of the way, so we can make way for new data.
A1: Right. So, actually that previous document I showed here, Ten Ways to Reduce Bad Data in Marketo, has a step that shows you how to set up a smart list and/or smart campaign. A smart campaign is maybe better, because if you do purge those records and you’re using a smart campaign to purge them, while their first name, last name, and email are removed from the results, you can still see the activity ID number, so you still have a sense of how many people it ran on, even after you’ve removed them from the database. So, it’s in there. I also have a video, and I can’t remember if that’s out yet, the unengaged one. Have we done that one yet?
We had one similar to that one. Not as similar of a process.
So, there’s the ten steps actually. There’s a video for each step, and we’ve published five of them and the other five will be coming out over the next few weeks. Anyway, this document will definitely help you in that regard.
Q2: Great presentation again, and just quickly I know one of your points was really focusing on how an organization really needs to be diligent about purging bad records.
What does RingLead do as an organization when you approach a client where there’s, I guess a culture of.. What’s the word I’m looking for? Maybe a hoarding culture? They don’t want you to delete anything. So, how do you guys approach that?
A2: Right, right. So, keep in mind there are two types of deletion. There’s one where you completely get rid of the record altogether: not in your CRM, not in your marketing automation. It’s gone. There’s another kind where you get rid of it only in marketing automation. These ten steps are a lot about getting rid of it at least on the marketing automation side, but then you need to have some sort of sync filter in place or use, perhaps, record type and profile permissions for the Marketo sync user to prevent Marketo from seeing records that sales still wants. They want to hoard them. Fine. Maybe it’s got a good number, maybe not, but the email address is obviously bad. The guy’s probably not at that company anymore, and in fact with one of the enrichment vendors that we’re working with, we’re working with them to be able to identify people that are no longer at a company, not just that the email address is bad but that they’re not there. We know they’re not there anymore. So anyway, you can at least remove them from Marketo and leave them for the hoarders. Now, are your hoarders also marketing people?
Sort of. I think for us, at least, we see a lot of hoarding, if you want to call it that; I guess we’ll use that word. It’s from a historical perspective. A lot of what they want to hoard are contact records. They don’t want to delete any sort of contacts that may have been-
Good point. In the past, depending on how many records you have in your database, how expensive your Marketo license is and all of that, I have kept records as old as five quarters that I knew were not engaging with me. I tried to engage, they didn’t engage. Maybe I knew they were bad. Those records obviously you want to exclude from enrichment at a minimum. You’re probably also excluding them from communications, maybe automatically if email invalid is true. But like I say, I’ve kept them for five quarters so I could compare quarter over quarter a year in the past, and then at the end of the fifth quarter, boom, out they went. That was actually acceptable. Everybody was comfortable with that. Again, it’s a matter of what the cost is to keep bad data around.
One of the things that I’ve done more recently, where I’m tending to get rid of records after maybe only a few quarters rather than five, is making sure I’ve got go-to reports that show me, “So, where did all my bad data come from? What are the sources of all my unsubscribes?” Because you don’t want to just say, “Oh, this is bad. Let’s get rid of them” and then have no historical perspective, no attribution that you can use to decide, “Well, that source is horrible. Let’s not use that source again”. So definitely, you want to lock down reports and save them. They can obviously get emailed out of Marketo if you’re using reports in Marketo, and just hoard those instead of the data. It’s a lot cheaper.
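The kind of source report described above can be approximated from an export before records are purged. This sketch assumes each record carries `source`, `email_invalid`, and `unsubscribed` fields; those names are illustrative rather than Marketo's actual field names.

```python
from collections import Counter

def bad_rate_by_source(records):
    """Map each lead source to the share of its records that are unusable."""
    totals, bad = Counter(), Counter()
    for record in records:
        source = record.get("source") or "Unknown"
        totals[source] += 1
        if record.get("email_invalid") or record.get("unsubscribed"):
            bad[source] += 1
    return {source: bad[source] / totals[source] for source in totals}

# Hypothetical export taken before the purge.
records = [
    {"source": "Trade show list", "email_invalid": True,  "unsubscribed": False},
    {"source": "Trade show list", "email_invalid": False, "unsubscribed": True},
    {"source": "Webinar",         "email_invalid": False, "unsubscribed": False},
    {"source": "Webinar",         "email_invalid": True,  "unsubscribed": False},
    {"source": None,              "email_invalid": False, "unsubscribed": False},
]

rates = bad_rate_by_source(records)
worst_source = max(rates, key=rates.get)
```

Saving the resulting summary, rather than the records themselves, keeps the attribution history while still letting the bad data go.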