While we are constantly creating data and analyzing it for use in our businesses, we often forget to reflect on the value of the data being collected. Businesses are aware of the data quality issues throughout their organizations, but many struggle with how to tackle them, unaware that there are simple steps they can take to ensure data quality, the first being data governance.
In a recent survey, over 60% of organizations indicated that their top data quality worries were too many data sources and inconsistent data. This tells us that more needs to be done to ensure that organizations are not overwhelmed by the data they have and know how to handle it.
Instead of treating the number of data sources as a problem, perhaps organizations should see it as a benefit: technology has progressed to match their needs. Front-end tools generate metadata and capture provenance, and data cataloging software then manages the disparity in sourcing.
We do, however, have to keep pushing for a cultural change around data, encouraging people throughout the organization to take responsibility for data quality, governance and general data literacy.
Although the sheer amount of data is a top concern, it will be hard for any organization to reduce the number of data sources it has. If anything, that number is likely to increase over time. Some other common data quality issues point to larger, institutional problems. Disorganized data stores and a lack of metadata are fundamentally governance issues and, with only 20% of respondents saying their organizations publish information on data provenance and lineage, very few organizations appear to have adequate governance.
Many of these problems originate with poor data quality controls at data entry. As any good data scientist knows, entry issues are persistent and widespread. In addition, practitioners may have little or no control over providers of third-party data, so missing data will always be an issue.
Data governance, like data quality, is fundamentally a sociotechnical problem, and as much as machine learning and artificial intelligence (AI) can help, the right people and processes need to be in place to truly make it happen.
Melody K. Smith
Sponsored by Access Innovations, the world leader in thesaurus, ontology, and taxonomy creation and metadata application.