Why Is Good Data Vital Currency and Bad Data a Disaster?
Issue 49: March 31, 2022
We often talk about the common problem of collecting massive amounts of data without a strategy to transform it into intelligent, action-oriented insights. We are all drowning in data as technology and digital interactions have allowed us to collect, store and access more of it than ever. Given the large amounts of data now at our fingertips, we need data scientists and analytics engineers to make sense of it all. We flatter ourselves with the sheer amount of data collected, as if size alone mattered, and assume the data we have collected is reliable, valuable and meaningful. As we discussed last week, there is also some confusion about what an output versus an outcome is, and data only adds to the confusion and challenge.
There is a more insidious reality in data management. We don’t identify and assess the right inputs that drive outcome measurement. We make faulty assumptions from incomplete or bad data, we assign the wrong definition and meaning to the data, and we rarely question its value. Human error is often the root of faulty data analytics and of what gets measured as a result. If you don’t ask the right questions in the context of the specific problem, how can you expect the data to reflect the appropriate answers? And if the data is incorrectly defined, incomplete or of little value, how do you make organizational decisions or truly understand organizational performance?
We are going to explore a topic that is frequently overlooked by management when establishing a data practice: how bad or faulty data can lead to disastrous decision making that results in taking the wrong actions or responses — often with significant consequences.
What Is Bad Data?
High-quality data that is appropriately and contextually defined and joined with other relevant data is accurate, action-oriented and therefore valuable. Bad data is data that is incomplete, incorrect, wrongly defined and/or irrelevant. Bad data can also be information that is outdated, duplicated in the database, unformatted (typos, spelling variations, inconsistencies) and inaccurately recorded.
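To make these symptoms concrete, here is a minimal sketch (field names and rules are illustrative assumptions, not a production standard) of how an automated audit might flag incomplete, duplicated and unformatted records:

```python
from collections import Counter

def audit_records(records, required_fields):
    """Flag common 'bad data' symptoms: missing values, duplicates,
    and inconsistent formatting (stray whitespace, casing variants)."""
    issues = []
    seen = Counter()
    for i, rec in enumerate(records):
        # Incomplete: required fields that are missing or empty
        for field in required_fields:
            if not rec.get(field):
                issues.append((i, f"missing {field}"))
        # Duplicated: the same normalized email appearing more than once
        key = rec.get("email", "").strip().lower()
        if key:
            seen[key] += 1
            if seen[key] > 1:
                issues.append((i, "duplicate record"))
        # Unformatted: leading/trailing whitespace or inconsistent casing
        name = rec.get("name", "")
        if name != name.strip() or (name and not name.istitle()):
            issues.append((i, "name formatting"))
    return issues

records = [
    {"name": "Ada Lovelace", "email": "ada@example.com"},
    {"name": " ada lovelace", "email": "ADA@example.com "},  # duplicate + formatting
    {"name": "Grace Hopper", "email": ""},                   # incomplete
]
print(audit_records(records, ["name", "email"]))
```

Real datasets need richer rules (fuzzy matching, reference lists of valid values), but even checks this simple surface most of the defects named above.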
Bad data can also be less obvious when data coders tinge the input with their conscious or unconscious biases. If coders provide facial recognition artificial intelligence applications with images of only one gender or race, or segments of each that don’t represent the whole of society, the output is faulty, and any decisions made on the output are incorrect. If determinations of audience and/or market are made from a small set of data that doesn’t reflect all the descriptive, identifying demographic data and its nuances, any decision becomes a small dart attempting to hit a huge bullseye. The only way to hit the target is sheer luck.
As reported in Forbes, according to research firm Gartner, “organizations believe poor data quality is responsible for an average of $15 million per year in losses.” Gartner also found that nearly 60% of those surveyed didn’t know how much bad data costs their businesses because they don’t measure it in the first place. If something isn’t understood, assessed or measured, how do you make decisions or take action on it? How do you learn from it?
Also reported in Forbes, “a study on the lack of trust around data from research firm Vanson Bourne (commissioned by SnapLogic) found that 91% of IT decision-makers believe they need to improve the quality of data in their organizations, while 77% said they lack trust in their organization’s business data.”
Quality of data is often assumed to be the responsibility of the IT Department or of the applications and machines themselves. We trust that applications and machines are accurate and effective since, as we know, we humans make mistakes. But the data managed by the IT Department, and the machines and applications it oversees, sits at downstream collection points. Think garbage in, garbage out: data collected downstream lands in data lakes, customer data platforms (CDPs), data management platforms (DMPs) and other hype-based “repository platforms and toolsets” that overpromise the ability to create immediate insights and intelligence.
We overlook the upstream inputs and sources of data that predominantly rely on humans. Sales staff enter data (or not) about accounts in their sales funnel; human resources enters or reviews data attached to candidates or existing employees; staff manage their own spreadsheets, and when they find an application challenging and hard to work with, they avoid it. When humans enter data and hit save or submit, the entry typically goes in without any review for accuracy.
Upstream data then becomes polluted, and that pollution carries into the downstream lakes. Some would argue that the IT Department or the software platforms and toolsets should recognize the input errors and find some technical, AI-based fix to counter human error, faulty input or the inability to critically assess one’s own work or the work of others in collecting and inserting data. Once again, our own biases and desire to deflect accountability lead to misperceptions that degrade the quality of data, and ultimately trust in decision-making.
Data Quality Assessment and Data Masters
Quality control management (QCM) is a complete approach and process for ensuring data is reliable and trustworthy. A formal, organization-wide QCM that sets forth the sources of data, the consistent and appropriate definitions and values of data, and the constant assessment of accuracy, completeness and relevance is a critical foundational initiative for any organization. Some organizations, including 2040, recommend a QCM framework built on a set of connected data masters (product, customer, etc.) that represent the concentric, connected circles of data attributes that come together to generate insight, meaning and value and correlate to performance, decisions and actions.
The QCM, whether managed by an individual or a team, should represent the upstream uses, definitions, purposes and collection points along with the business needs and intent. At the core of the QCM is recognition of when limited value or meaning exists, and recognition of the upstream influences that must be known and adjusted for.
- The Source
If the source is inaccurate, the data output will reflect that. If the source contains faults, misaligned programming or process logic, or the upstream inputs generate frequent errors, then the output is faulty and will result in the wrong insights. The challenge compounds when an organization seeks to level-set the quality, value and completeness of data by using another data source to validate it. If the comparative source is as faulty as the first, one would hope the disconnect might be obvious. But in reality, internal systems may replicate the same faulty definitions, assigned values or errors. Any decision based on that data then only propagates and replicates its impacts across the organization.
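As a concrete illustration of this validation trap, the sketch below (field names are hypothetical; a simplified stand-in for real reconciliation tooling) compares two internal sources. The caveat from the text applies: agreement between sources does not prove correctness, since both may share the same upstream fault; only disagreement is a reliable signal that something is wrong.

```python
def reconcile(source_a, source_b, key="customer_id"):
    """Compare two systems' views of the same records and report
    disagreements. Agreement does NOT prove correctness (both systems
    may replicate the same upstream error), but every conflict found
    means at least one source is wrong."""
    b_index = {rec[key]: rec for rec in source_b}
    conflicts = []
    for rec in source_a:
        other = b_index.get(rec[key])
        if other is None:
            conflicts.append((rec[key], "missing in source B"))
            continue
        for field in rec:
            if field != key and rec[field] != other.get(field):
                conflicts.append((rec[key], field))
    return conflicts

# Hypothetical CRM vs. billing-system views of the same customers
crm = [{"customer_id": 1, "segment": "enterprise"},
       {"customer_id": 2, "segment": "smb"}]
billing = [{"customer_id": 1, "segment": "enterprise"},
           {"customer_id": 2, "segment": "mid-market"}]
print(reconcile(crm, billing))  # [(2, 'segment')]
```

Customer 1 agreeing across both systems tells you nothing if both inherited the same faulty definition; customer 2, by contrast, is a guaranteed lead for investigation.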
- The Data You Need
We have made the point that technology and digital have afforded organizations the opportunity to collect masses of data from employees, business processes and customers. The applications, systems and toolsets we use day to day run our organizations. We like to hoard and are reluctant to choose what is or isn’t important; we save everything for a rainy day. When loads of information and data exist, humans often do not want to expend the energy to critically assess what is or isn’t needed, or they deflect ownership because it “isn’t their responsibility.” A critical, foundational exercise is to define and identify the data you need for understanding operational and market performance, and to bridge it to the measurement of outcomes so you understand who customers are in their relationship to the organization and how the organization is performing in meeting their needs and wants. Related to this is the importance of life-stage marketing and measuring what matters, both of which inform how data models, foundations and collection remain meaningful and relevant to managing and curating accurate, insightful connected data.
- Deep Cleanse
Regular data cleansing should be the responsibility of anyone in the organization responsible for upstream data inputs. Those in the upstream are often the operations and business departments of an organization. Since they know the business most intimately, they are the ones inputting data and deriving insights, and they must take responsibility rather than deflecting the quality and/or cleansing of data to the IT Department or even the “data scientists or analytics engineers.”
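A minimal sketch of what routine cleansing by upstream owners can look like, assuming a simple record layout with hypothetical fields: normalize formatting first, then collapse duplicates while keeping the most recently updated version of each record.

```python
def cleanse(records):
    """Normalize formatting, then collapse duplicates, keeping the most
    recently updated version of each record (keyed by normalized email).
    Assumes ISO-formatted 'updated' dates so string comparison works."""
    latest = {}
    for rec in records:
        cleaned = {
            "email": rec["email"].strip().lower(),
            "name": " ".join(rec["name"].split()).title(),
            "updated": rec["updated"],
        }
        key = cleaned["email"]
        if key not in latest or cleaned["updated"] > latest[key]["updated"]:
            latest[key] = cleaned
    return list(latest.values())

raw = [
    {"name": "jane  doe", "email": "Jane@Example.com ", "updated": "2022-01-10"},
    {"name": "Jane Doe",  "email": "jane@example.com",  "updated": "2022-03-01"},
]
print(cleanse(raw))  # one record survives, the 2022-03-01 version
```

The survivorship rule here ("keep the newest") is itself a business decision; the people closest to the data, not the IT Department, are best placed to choose it.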
- Ongoing Review
Maintaining data isn’t a once-and-done activity or exercise. It is an ongoing process of checks and balances that constantly reassesses data’s definition and value. The past is helpful and indicative, but it reflects factors and variables that existed in context at that time. Consider, for example, how our current economy contends with high inflation and a return to some version of normalcy after a pandemic, in a world immersed in the war in Ukraine. The data that explained the past is no longer relevant, and relying on those analytics makes it hard to manage the present or predict the future. Measuring the performance of a new product or market against an earlier product and market will return some form of data, but the connections may be forced, out of context and out of line: the products aren’t the same, the factors and variables are different, and the markets may be completely unrelated.
Ensuring Data Currency and Maintaining Accuracy
1. Ensure Currency and Applicability
Organizations often have significant amounts of transactional data. This data represents past sales of products or services. Transactional data is historical data and may not indicate current or future performance or equate to the measurement of outcomes in the present. Additional data and analysis are required to recognize what and who comprised past performance. Updated data enables the contextualization of current factors and variables. Only by accounting for present environmental conditions does past data become indicative or informative.
2. Pay Attention to Larger Patterns
Manu Bansal of the Forbes Technology Council states, “When bad data hurts your organization, it is important not to assume that this is just an isolated, one-time event.” Where there is one error, there are often others, and the errors have been replicated across systems, applications and data stores.
Often bad data exists for a while, and decisions have been made continuously on that data. When the bad data is revealed, individuals and teams become disillusioned since their own performance has been routinely measured against it. Reluctance and, yes, fear grow when the data correction is reported, knowing it may have larger implications for public or similar reporting and may trigger crisis communications strategies to manage stakeholder reactions. In situations where the correction is not brought to light, a revisionist approach takes over internally to justify the error. This only creates more challenges and complexity in gaining and leveraging accurate data for insights and measurement.
3. Know the Tools and Platforms You Need
The pace of change in data collection technological platforms and tools can be overwhelming as so many technology providers have added some element of machine learning and artificial intelligence to their offerings. When you don’t know the nature of the data you have and whether it is or isn’t of quality, defined or valued, how then can you make the best selection of tools or platforms that will help curate, manage and store the ever-increasing amounts of organizational data?
Understanding the data you have, how it is defined and how it needs to be used — as well as where it comes from and how it is ingested — is a critical viewpoint in selecting an appropriate tool. Remember, AI and machine learning are programmed by humans who may not have the same context of the organization you do. Plus, machine learning may have been programmed unrelated to who your customers are and what business you are in.
Data-Driven Decision Making
Making business decisions on gut instinct alone can lead to some bad ones. Data-driven decisions can make you more confident because they rest on objective information that is logical and concrete. Yet making those decisions based on faulty data can lead to ruin. Tim Stobierski of Harvard Business School Online writes, “Tie every decision back to the data. Whenever you’re presented with a decision, whether business-related or personal in nature, do your best to avoid relying on gut instinct or past behavior when determining a course of action. Instead, make a conscious effort to apply an analytical mindset.”
A data-driven decision-making process can make you more proactive. Stobierski adds, “Given enough practice and the right types and quantities of data, it’s possible to leverage it in a more proactive way—for example, by identifying business opportunities before your competition does, or by detecting threats before they grow too serious.”
One of the most positive outcomes of using data is to reduce expenses. Randy Bean, CEO of NewVantage Partners, states, “Big data is already being used to improve operational efficiency. And the ability to make informed decisions based on the very latest up-to-the-moment information is rapidly becoming the mainstream norm.”
Stobierski adds, “Data visualization is a huge part of the data analysis process. It’s nearly impossible to derive meaning from a table of numbers. By creating engaging visuals in the form of charts and graphs, you’ll be able to quickly identify trends and make conclusions about the data.”
Bad Business Decisions
The saying goes that hindsight is 20/20; we say it’s really 20/40, a revelation of blindsight. Ladders has identified a list of companies that went out of business based on remarkably bad business decisions. We have selected a few of them as examples of how the intelligent use of data might have produced entirely different outcomes. We suggest using these examples as cautionary tales of how not to let your own business fall into a similar trap.
- A&W

“While A&W has never technically filed for bankruptcy, there are far fewer of them these days. The fast-food chain suffered a major loss in the 1980s because it ran a special promotion that consumers simply didn’t understand. A&W created a third-pound burger to compete with McDonald’s popular quarter pounder and offered it at the same price. But the burger didn’t sell, a fact that was absolutely flummoxing to the company.
“A focus group confirmed: Americans have no common sense. Because the number three is smaller than the number four, consumers thought they were getting a smaller burger when, in fact, the meal at A&W had more meat.”
We’re not so convinced this can be blamed on the American public. Any basic customer research would have revealed the potential problem. This is also an example of management hubris. One of our 2040 mantras is about the negative effects of inherent bias; just because you think an idea is great does not mean your customers – or even your workforce – would agree.
The collection of data about your customers is essential to ensure that your products and services are on track and aligned with what they expect from you.
- MoviePass

“A subscription-based ticketing service for movies, MoviePass launched in 2011 and was developed into an app shortly thereafter. For a monthly fee, MoviePass allowed each subscriber to redeem three movie tickets and over time, new plan structures evolved with more benefits for the user. What happened was the unlimited options gave way to financial loss, as users went to more and more movies on MoviePass. Competitors were updating their rewards programs at record speed, also pricing out some of the lucrative options for users. In 2018, long before the pandemic threw movie attendance into disarray, MoviePass ran a deficit of millions and never recovered and went out of business in 2020.”
This use case is a goldmine for proving that data can provide the customer insights needed to prevent a business model from derailing. Analyzing expected customer behavior matched to financial results would seem obvious. Couple that with a comprehensive business model based on sound assumptions, and MoviePass could have become the popular AMC Rewards ticket subscription service rather than be eclipsed by it.
- Red Lobster
“In 2003, Red Lobster ran an extravagant promotion that eventually landed them in bankruptcy. That year, they enticed customers to come in and enjoy an all-you-can-eat snow crab experience for the low cost of $20. The problem? Snow crab is expensive and highly regulated by the government. While snow crab was under $5 per pound at the time, the restaurant underestimated consumers’ appetites. Diners came in and ate and ate and ate. In fact, they ate so much that the promotion cost Red Lobster over $1.1 million per month.”
This example of not using data and analytics to field-test a new product is, in retrospect, a giant “what were you thinking?” The power of market research is its ability to flag a misjudgment or faulty assumption, which would have averted this disastrous loss-leading promotion. More fundamentally, running financial models to test a new product seems pretty basic.
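The kind of basic financial model the paragraph calls for can be sketched in a few lines. The $20 price and roughly $5-per-pound crab cost come from the quote above; the per-diner overhead figure is a hypothetical assumption for illustration:

```python
def promo_margin(price, food_cost_per_lb, lbs_eaten, other_cost_per_diner):
    """Per-diner margin for an all-you-can-eat promotion.
    Margin goes negative once diners eat past the break-even weight."""
    return price - food_cost_per_lb * lbs_eaten - other_cost_per_diner

PRICE = 20.00        # promo price, from the quote
CRAB_PER_LB = 5.00   # "under $5 per pound at the time"
OTHER_COST = 4.00    # hypothetical labor/overhead per diner (assumption)

# Break-even: diners can eat (PRICE - OTHER_COST) / CRAB_PER_LB pounds
break_even_lbs = (PRICE - OTHER_COST) / CRAB_PER_LB
print(break_even_lbs)                                   # 3.2 lbs
print(promo_margin(PRICE, CRAB_PER_LB, 5, OTHER_COST))  # -9.0 at 5 lbs
```

Under these assumed numbers, any diner who eats past roughly three pounds puts the promotion underwater, which is exactly the question a pre-launch model (or field test of actual appetites) should have asked.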
- JC Penney

Business books will be written about JC Penney and its fall from retail grace. It has had a revolving door of CEOs, rebranded and updated its stores, and even filed for bankruptcy. When whiz-kid retailer Ron Johnson came from Apple to JC Penney, he brought with him a vision for a new retail model. He redesigned the stores into marketplaces, which could have worked. But he also got rid of the familiar pricing promotions to create an everyday fair-price merchandise model. Because JCP stopped putting discount pricing on everything, customers thought they were no longer getting a deal. Sales came to a screeching halt, as did Mr. Johnson.
Johnson’s strategy neglected several key factors: market testing with customers and bringing management into his innovative thinking. Instead, he made assumptions about loyal customers and the workforce’s ability to pivot without being included in the decision-making process. This is another example of hubris and leadership with blinders believing they have the best ideas, the most progressive solutions and a monopoly on innovation.
Guard Your Data
At 2040, we have been pioneers in digital strategies. We have advised countless clients in the art and science of data and analytics, helping them develop the systems and processes that protect their data and guard against poor decision-making based on faulty input. Our goal is to optimize outcomes by ensuring output is accurate, reliable and relevant. Get in touch with us to audit whether your data may be tainted and to help you safeguard one of your most precious resources: your intrinsic data.
Get “The Truth about Transformation”
The 2040 construct for change and transformation. What’s the biggest reason organizations fail? They don’t honor, respect and acknowledge the human factor. We have compiled a playbook for organizations of all sizes that considers all the elements that comprise change, and we have included some provocative case studies illustrating how transformation can quickly derail.