Strong data-quality knowledge management system

In the third and final blog of my series about winning the DQ battle, I'm going to look in more detail at something I mentioned in the last two blogs (1, 2), namely the DQ knowledge-management system. When data-quality becomes a key priority for a company, this is one of the crucial tools for achieving success. It supports the continuous improvement of the data-quality of the reports and data-objects that business users rely on for their decision making. But what exactly does it look like? And what key features does such a system need to be strong and reliable enough to achieve the goals set for it?


What is a strong DQ knowledge-management system?

A strong DQ knowledge-management system must provide enough information to satisfy all the queries and needs of all the different types of users who are concerned about data-quality, namely:

  • business users who need information about the data-quality of reports and data-objects used for their decision making,

  • and all the users responsible for the data-quality within the company (DQ manager, data-stewards, developers and others).

That's a large and diverse set of users, and when we say "enough information" we mean that the system should contain:

  • documentation about DQ checks,

  • the operational status of DQ checks (the results of the DQ checks that have been run),

  • and seamless access to the related data-objects' and reports' documentation,

which is quite a lot of information when you think about it.
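
To make those three kinds of information concrete, here is a minimal sketch in Python of how they might fit together. It is an illustration only, not a description of how any particular product stores its data: the class names, fields and the `object_status` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class DQCheck:
    """Documentation of one DQ check (hypothetical fields)."""
    check_id: str
    description: str
    business_rule: str
    owner: str
    # Names of the related data-objects and reports, so their
    # documentation can be reached from the check.
    data_objects: list[str] = field(default_factory=list)


@dataclass
class DQCheckRun:
    """Operational status: the result of one run of a DQ check."""
    check_id: str
    run_at: datetime
    passed: bool


def object_status(data_object: str, checks: list[DQCheck], runs: list[DQCheckRun]) -> str:
    """Return 'OK', 'INCIDENT' or 'UNKNOWN' for a data-object or report,
    based on the latest run of every check related to it."""
    related = {c.check_id for c in checks if data_object in c.data_objects}
    latest: dict[str, DQCheckRun] = {}
    for run in sorted(runs, key=lambda r: r.run_at):
        if run.check_id in related:
            latest[run.check_id] = run  # later runs overwrite earlier ones
    if not latest:
        return "UNKNOWN"
    return "OK" if all(r.passed for r in latest.values()) else "INCIDENT"
```

A business user's view would then only need to call something like `object_status("churn_report", checks, runs)` to get a single, up-to-date answer, while data-stewards can still drill down into the individual checks and runs behind it.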



What is important for whom

There will always be data-quality incidents in the everyday life of any company. The crucial function of any DQ knowledge-management system is that it alerts business users when the quality of an object is insufficient and could lead to bad business decisions. So for business users the system needs to:

  • Clearly provide an up-to-date data-quality status for the data-objects & reports that interest them.

  • Give them a means to ask for information from data-stewards and the DQ manager, ask questions and provide feedback.

  • Be simple. If users need training to use it, not only will it cost too much time and money to implement, but they simply won't use it.

The other key type of user is anyone who has some kind of role or responsibility concerning data-quality: chiefly the DQ manager (see last week's blog for more info), data-stewards / data-analysts and the developers of the DQ checks.

The following list shows what each of these people needs to be able to do in the knowledge-management system:


Roles & Needs

DQ manager

  • Set and implement a DQ methodology that's valid for the whole corporation

  • Monitor DQ checks and results

  • Monitor projects and check on their DQ methodology compliance

  • Have a simple communication channel with the business users and top-management concerning DQ

  • Have an overview of what DQ checks are being implemented in the different systems

Data analysts / data stewards

  • Access the DQ checks documentation - see what DQ checks have been implemented, where and by whom, which data-objects they're related to, and what business rules are the basis of these DQ checks.

  • Access impacted data objects and their documentation.

  • Access the results and histories of the DQ checks - what's the result of each DQ check run - has there been a DQ incident in the checked data or is the quality acceptable? How often do incidents occur?

  • Access the details of all DQ checks.

  • See the relationships between DQ checks and business rules / business terms - what general business rule is applied for the purposes of the DQ check? Here I understand the business rule as a general principle and the DQ check as its implementation in a specific context (e.g. rule = "VAT percentage is 21%" and check = "verify that all VAT percentages on customers' invoices are 21%").


  • Design which DQ checks should be implemented.
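
The difference between a business rule and a DQ check, and the idea of a check history, can be sketched in a few lines of Python. This is only an illustration built on the VAT example above: the invoice layout (`id`, `vat_percent`) and the function names are assumptions made for this sketch, not part of any real system.

```python
from decimal import Decimal

# Business rule: a general principle, independent of any one system.
VAT_RATE = Decimal("21")  # "VAT percentage is 21%"


def check_invoice_vat(invoices):
    """DQ check: the rule applied in a specific context (customers' invoices).
    Returns the ids of invoices whose VAT percentage breaks the rule."""
    return [
        inv["id"]
        for inv in invoices
        if Decimal(str(inv["vat_percent"])) != VAT_RATE
    ]


def incident_rate(run_results):
    """History of a check: the share of runs that found at least one
    offending record. Each item in run_results is one run's list of ids."""
    if not run_results:
        return 0.0
    incidents = sum(1 for offenders in run_results if offenders)
    return incidents / len(run_results)
```

Keeping the rule (`VAT_RATE`) separate from the check means that when the rule changes, every check built on it can be found and updated, which is exactly the relationship a data-steward needs to see in the documentation.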


What should it look like?

That's a lot of different needs - those responsible for data quality require completely different things from the business user. However, it's actually less complicated than it seems. We at Semanta have been developing Ency, a DQ knowledge-management system, for a number of years and we believe it works well at reconciling these different needs. This is what I'd suggest you look for when choosing such a system:

  • Communication, communication, communication. The system must be alive and social. There's no getting away from this requirement. The data stewards need to be able to use the system to explain what the current DQ issues are and how serious they are. The business users need to be able to read this information, check DQ statuses and provide feedback by reporting DQ issues they've found or asking questions. Whatever system you choose, it must have an easy-to-use communication channel between the two sides. We've designed Ency so that it makes use of current social networking trends and tools, and have found this very effective in getting new users engaged.

  • Simplicity. As I pointed out above, if the system requires a long training course just to get started, it won't be taken up and your DQ initiative will be over before it's begun. Look for an intuitive, user-friendly interface, one which doesn't need a manual to navigate. We make Ency modifiable, so each business user can set up their own dashboard with a personalised overview of the data-objects & reports they use and their DQ status.

  • Accessibility. This partly falls under simplicity, but it's a key issue so I will set it out as a separate point. The information any user is looking for needs to be easily accessible. Whether it's the DQ manager monitoring DQ check results or a business user wanting to know if they can trust the data in the latest churn report - finding that information should only take a couple of mouse clicks. With Ency we try to keep it to a maximum of three: one to open the system, one to find the page and a possible third to find the table.



The design of the DQ knowledge-management system is as important to the success of your DQ battle as the staff who are fighting it. It needs to be able to efficiently support the data-quality improvement process within a complex system and satisfy the needs of several different types of user. For this reason it needs to be simple, flexible and adaptable. It must be a mandatory feature for all parts of the company and all the IT systems dealing with DQ. Clearly it needs not only a budget that's big enough to cover its introduction and operation, but also support from the company's top-management that's strong enough to get it used across the whole enterprise.

Read also the first part of this blog series: Winning the Data Quality Battle and the second part: Champions of data-quality management.