Data Quality: The Impact of Poor Data Quality
Data quality is a measure of how well your data meets or exceeds the criteria set for it. Unfortunately, many organizations suffer from poor-quality data, which many refer to as data technical debt or simply data debt.
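To make the definition concrete, here is a minimal sketch of measuring data against explicit criteria. The criterion names, the sample records, and the `quality_report` function are all hypothetical illustrations, not part of any particular tool or of the Agile Data method itself.

```python
from typing import Callable

Record = dict[str, object]
Criterion = Callable[[Record], bool]

# Hypothetical criteria: each name maps to a check applied to one record.
CRITERIA: dict[str, Criterion] = {
    "email_present": lambda r: bool(r.get("email")),
    "age_in_range": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
}


def quality_report(records: list[Record],
                   criteria: dict[str, Criterion]) -> dict[str, float]:
    """Return, per criterion, the fraction of records that satisfy it."""
    total = len(records)
    if total == 0:
        return {}  # no records, so there is nothing to score
    return {
        name: sum(1 for r in records if check(r)) / total
        for name, check in criteria.items()
    }


# Made-up sample data for illustration only.
SAMPLE: list[Record] = [
    {"email": "a@example.com", "age": 34},
    {"email": "", "age": 29},               # missing email
    {"email": "c@example.com", "age": 240},  # implausible age
    {"email": "d@example.com", "age": None}, # missing age
]

report = quality_report(SAMPLE, CRITERIA)
for name, score in report.items():
    print(f"{name}: {score:.0%} of records pass")
```

A score below 100% for any criterion indicates data debt against that criterion. Real criteria would come from the people who use the data, not from the code.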
Why Should You Be Concerned About Poor Data Quality?
Data technical debt has a measurable impact on organizations, as these estimates show:
- Gartner estimated that the average annual cost of poor data quality to organizations was $12.9 million (2021).
- Accenture found that only one third of executives trust their data sufficiently to derive value from it (2019).
- Thomas Redman estimated that most organizations lose between 15% and 25% of revenue due to bad data (2017).
- IBM estimated that $3.1 Trillion (yes, Trillion with a T) of American GDP was lost annually due to poor data quality (2016).
- The New York Times reported that data scientists spend 50-80% of their time on data wrangling (2014).
How Does Poor Data Quality Impact Your Organization?
Data technical debt is a serious problem for several reasons:
- Longer time to market for new customer offerings. It is much more difficult to work with low-quality data than with high-quality data, because of the extra effort required to understand the data and then to fix it well enough to make it usable.
- Increased cost to work with low-quality data. The increased time to work with lower-quality data results in increased cost to do so.
- Unpredictability of schedules and cost. Because most data technical debt is hidden, it is difficult to predict how much effort it will take to work with, and to evolve, existing data. You often do not know how big the mess really is until you investigate the situation.
- Poor decision support. Your ability to make good decisions is directly affected by the quality of the data available to you. Garbage in, garbage out (GIGO).
- Poor-quality artificial intelligence (AI). Similarly, the quality of your AI systems is directly influenced by the quality of the data used to train them. Once again, GIGO.
- Decreased collaboration. An indirect problem with poor-quality data is that it can decrease collaboration between teams, which is unfortunate because removing data technical debt often requires cross-functional collaboration. This decreased collaboration is often the result of “finger pointing”: the developers didn’t work with the source of record, the data people were too hard to work with, this team didn’t keep the documentation up to date, and so on.
The quality of your data has a material impact on your organization. This is why the Agile Data (AD) method advises you to be “quality infected” and to ensure high data quality within your organization.
Related Reading in Data Quality
- The Agile Database Techniques Stack
- Data Quality Techniques
- Data Technical Debt
- Metaphor for Data Quality: Data is the New Water