Forty-four percent of companies don’t have a formal data governance policy, and 22% of firms without a data policy have no plans to implement one.
Older research shows that one of the big problems is that nobody knows who is supposed to be responsible for maintaining the accuracy of data. Business people overwhelmingly think that it’s IT – but IT organizations typically don’t have much control over the business processes that lead to poor data in the first place.
Fixing data is essential – but hard
The best dashboards and reports in the world are useless if you can’t trust the data – you’re just putting lipstick on a pig.
Here are some of the key best practices for improving data quality:
Investigate. First, monitor the existing data quality (data profiling products like SAP Information Steward can be a big help here). Follow the data chain to the final users, and find all the examples and anecdotes you can about the business problems poor data quality has caused. Find examples of risks that have resulted from bad data, or any news articles about industry or competitor problems in this area. Call it data governance, and see if you can piggyback on existing governance and compliance processes inside the organization.
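As a rough illustration of what a first profiling pass might look like without a dedicated product, here is a minimal Python sketch. The `profile` function, the field names, and the sample customer records are all hypothetical, made up for this example; a real profiling tool does far more.

```python
from collections import Counter

def profile(records, fields):
    """Return per-field missing percentage, distinct count, and top value."""
    report = {}
    total = len(records)
    for field in fields:
        values = [r.get(field) for r in records]
        # Treat None and blank strings as missing.
        present = [v for v in values if v is not None and str(v).strip() != ""]
        missing_pct = round(100 * (total - len(present)) / total, 1) if total else 0.0
        report[field] = {
            "missing_pct": missing_pct,
            "distinct": len(set(present)),
            "most_common": Counter(present).most_common(1),
        }
    return report

# Hypothetical sample data, invented for illustration only.
customers = [
    {"name": "Acme Corp", "country": "US", "email": "ops@acme.example"},
    {"name": "Acme Corp", "country": "usa", "email": ""},
    {"name": "Globex", "country": None, "email": "info@globex.example"},
    {"name": "Initech", "country": "US", "email": None},
]

report = profile(customers, ["name", "country", "email"])
for field, stats in report.items():
    print(field, stats)
```

Even a crude report like this surfaces talking points: half the email addresses are missing, and the same country appears under two spellings.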
Get people to care. This is the hardest part. You have to get people to feel the pain. IT organizations have a tendency to hide bad data, since they rightfully worry that it will lower the credibility of the reporting systems. But unless the data is shared, nobody will be aware of the problem, and nothing will be done (and they’ll blame IT). The trick is to provide the data, but make sure it’s clearly labeled as suspect, and make sure that there’s a link to more information about why the data is bad.
It’s all about money. Nobody cares about bad data – they care about what it’s doing to profits. Put a dollar amount on the problem. It doesn’t have to be anything complex initially – just do back-of-the-envelope calculations to see if it’s worth doing something about it. If it is, share your calculations with finance and other teams – they’ll be happy to point out your mistakes, and it might get them thinking and talking.
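A back-of-the-envelope calculation can be as simple as volume times error rate times cost per error. The sketch below assumes a made-up scenario (failed deliveries caused by bad addresses); every number in it is a placeholder to be replaced with your own estimates, not a measured figure.

```python
def annual_cost_of_bad_data(records_per_year, error_rate, cost_per_error):
    """Rough annual cost: how many records go wrong, times what each one costs."""
    return records_per_year * error_rate * cost_per_error

# Hypothetical inputs, for illustration only.
orders_per_year = 200_000
bad_address_rate = 0.02          # assume 2% of orders have an unusable address
cost_per_failed_delivery = 15.0  # assume reshipping plus support time, in dollars

estimate = annual_cost_of_bad_data(
    orders_per_year, bad_address_rate, cost_per_failed_delivery
)
print(f"Estimated annual cost: ${estimate:,.0f}")
```

The point is not precision. Writing the assumptions down as named inputs makes it easy for finance and other teams to challenge each one.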
Create a team of data stewards. Business people think data quality is an IT problem, while IT people know better. Get a group of people together to fix the issues. Take care to set expectations correctly – it will be more complex and take longer than anybody expects. Start with the “slow, fat rabbits” – easily fixed problems – and heavily publicize the benefits. Use the goodwill generated to tackle the harder problems.
Stop bad data getting in. It’s much cheaper to stop bad data from getting into your systems than it is to clean it up afterwards. Invest in a “data quality firewall” that checks for bad or duplicate data as it’s being entered into your operational systems (for example, SAP Data Services integrates tightly with SAP and other operational systems).
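To make the “firewall” idea concrete, here is a minimal Python sketch of a check that runs at the point of entry and rejects invalid or duplicate records before they are stored. The `EntryGate` class, its rules, and the sample records are hypothetical and deliberately simplistic; this is not how SAP Data Services or any other product works internally.

```python
import re

# Deliberately loose email pattern; real validation rules would be stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

class EntryGate:
    """Hypothetical entry-time check: validate, then reject duplicates."""

    def __init__(self):
        self.seen = set()     # normalized keys of accepted records
        self.accepted = []    # records that passed every check

    def submit(self, record):
        """Return a list of problems; an empty list means the record was accepted."""
        problems = []
        name = (record.get("name") or "").strip()
        email = (record.get("email") or "").strip()
        if not name:
            problems.append("missing name")
        if not EMAIL_RE.match(email):
            problems.append("invalid email")
        # Normalize case and whitespace so near-identical entries collide.
        key = (name.lower(), email.lower())
        if not problems and key in self.seen:
            problems.append("duplicate")
        if problems:
            return problems
        self.seen.add(key)
        self.accepted.append(record)
        return []

gate = EntryGate()
first = gate.submit({"name": "Acme Corp", "email": "ops@acme.example"})
repeat = gate.submit({"name": " acme corp ", "email": "OPS@acme.example"})
broken = gate.submit({"name": "", "email": "not-an-email"})
print(first, repeat, broken)
```

Returning the list of problems, rather than silently dropping the record, lets the entry screen tell the user what to fix while the record is still in front of them.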
Never stop cleansing. Data degrades over time (especially customer data). You need a long-term approach to detecting, monitoring, and fixing data quality, and it needs to be made the clear responsibility of an internal team, such as a BI competency center.
More required reading
Here are some posts that might cheer you up before you tackle your data quality issues:
Brushes image by Bright_Tai