Data quality and business metadata combined?

I'll spare you the IT gibberish in this blog post. Instead, I'm going to look at three examples that show how business metadata and data quality checks, combined, provide more value than they do separately:



Let's begin with a business question: Do we have enough consumer client data at a high enough quality to do a direct-mail campaign for young adults?

Let's look up "consumer client" in a business dictionary that includes data quality checks: how is "consumer client" defined, and what data quality status does it have (e.g. a 76% quality score)? Let's dig in and find out which other terms are related to and influence "consumer client", e.g. the business terms "client's demographics" (65%) and "client's contacts" (84%).

Looking at this information, we can see that we might have trouble with segmentation, because the score for "client's demographics" is comparatively low (65%). The higher score for "client's contacts" (84%) suggests that we should still be able to contact clients effectively.
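For readers who like to see the idea in code, here is a minimal sketch of that lookup. It assumes a hypothetical in-memory dictionary; `BusinessTerm`, `DICTIONARY`, `quality_report` and the 70% threshold are all made up for illustration and are not part of any product. Only the three scores come from the example above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BusinessTerm:
    """A hypothetical business dictionary entry with a data quality score."""
    name: str
    quality_score: float  # percent, 0-100
    related: List[str] = field(default_factory=list)


# Scores are the example figures from the text; the structure is an assumption.
DICTIONARY = {
    "consumer client": BusinessTerm(
        "consumer client", 76.0, ["client's demographics", "client's contacts"]
    ),
    "client's demographics": BusinessTerm("client's demographics", 65.0),
    "client's contacts": BusinessTerm("client's contacts", 84.0),
}


def quality_report(term_name, threshold=70.0):
    """Return {term: (score, meets_threshold)} for a term and its related terms."""
    term = DICTIONARY[term_name]
    report = {}
    for name in [term.name, *term.related]:
        score = DICTIONARY[name].quality_score
        report[name] = (score, score >= threshold)
    return report


report = quality_report("consumer client")
for name, (score, ok) in report.items():
    status = "ok" if ok else "needs attention"
    print(f"{name}: {score:.0f}% ({status})")
```

With an assumed threshold of 70%, only "client's demographics" is flagged, which matches the reading above: segmentation is the risk, contacting clients is not.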

If a manager were to ask a data steward to focus on a client's demographics, s/he would actually dive even deeper than just the business term, down to the underlying data structures (tables, columns, related business or data quality rules).

Data quality in a business dictionary makes data quality indicators tangible - it links them to concepts that have value in the eyes of business people, because they can more easily understand them and relate them to their day-to-day business.

Ataccama Data Quality Dashboard combined with Air for Web and business definition from Business Dictionary.



Another business question might be: Can I trust the numbers in this churn report? Typically, reports in telco use a churn business term that is defined as changes to the subscription base in several dimensions. The most important thing is that people actually trust this number and spend their meetings discussing how to improve it, not whether SIMs suspended because of collections should or shouldn't be included. If business definitions are managed properly, the discussion will focus where it should: on improving the number, not on the validity of the data.

The transparency of relationships between the business terms used in a report and its underlying data tables, combined with data quality checks, validates the correctness, timeliness and completeness of the data used in the report. This creates trust in where the reports are coming from.

Data quality information combined with relationships between report and data sources presented in Report Catalog.
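To make "completeness" and "timeliness" a little more concrete, here is a rough sketch of two such checks on the rows behind a report. The field names (`sim_id`, `status`), the sample rows and the one-day freshness limit are assumptions for illustration only; a real data quality tool would define these rules in its own way.

```python
from datetime import datetime, timedelta


def completeness(rows, required_fields):
    """Share of rows in which every required field is filled in."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(f) not in (None, "") for f in required_fields)
        for row in rows
    )
    return complete / len(rows)


def is_timely(last_loaded, now, max_age=timedelta(days=1)):
    """True if the data was loaded no longer than max_age before now."""
    return now - last_loaded <= max_age


# Hypothetical rows feeding a churn report.
rows = [
    {"sim_id": "A1", "status": "active"},
    {"sim_id": "A2", "status": "suspended"},
    {"sim_id": "A3", "status": None},  # missing status
    {"sim_id": "", "status": "active"},  # missing SIM id
]

score = completeness(rows, ["sim_id", "status"])
fresh = is_timely(datetime(2024, 1, 2, 6, 0), now=datetime(2024, 1, 2, 18, 0))
print(f"completeness: {score:.0%}, timely: {fresh}")
```

Results like these only become meaningful to a business user once they are attached to the report and the business terms it uses, which is the point of linking the two.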

Feedback loop

How do we do this, though? The problem is that someone must actually DO things (write definitions, correct errors …) to set up a functional metadata/data quality process, and this always brings with it problems of adoption, motivation and accessibility.

This is why data ownership is, for me, the cornerstone of any metadata or data quality initiative. Lowering the barriers to user adoption as much as possible is the key to user-friendly data quality for the masses.

Metadata tools like the business dictionary, report catalogue and data dictionary help to give data ownership a tangible form. Anyone in a company can look up who is responsible for an information asset. But a user also needs to be able to give feedback or ask for information easily, directly in the application s/he works with. If such a catalogue is integrated into a reporting platform or Excel, and a user can simply ask a question or report a data quality issue directly from the working context, without jumping through hoops or having to know "how" to do it, that would surely bring data ownership and feedback a step closer.

Ask a question or report data quality issue directly from a report using Air for Web.



Business metadata can provide business users with data quality indicators that they can understand and see the value of in their everyday working life. Data quality is a natural extension of business dictionaries, report catalogues and data dictionaries. It provides users with vital information about whether the content they're reading is up-to-date, can be trusted, and can be used with confidence.

IMHO, one way to get business folks involved in data quality and participating in a data quality initiative is to make them understand this. Show them its value.