How to support the process of data warehouse decommissioning?

How many of us have met a colleague in the hallway and started cursing the functionality of the company's data warehouse and the unsatisfactory state of reporting? Let's face it, people like to grumble and with data warehouses, there really is something for everyone. To be fair, the criticisms aren't always justified e.g. due to certain fundamental societal and technological changes, businesses have sometimes been left with no choice but to build new data warehouses. However, most of the causes of failure and frustration lie elsewhere.

The decision to introduce a new data warehouse is usually accompanied by the desire to decommission the old, non-compliant one. At this point, it is a strategic decision, often with a political context, which has deep consequences for many parties and departments. The arguments for such a large project, involving thousands of man-days, are mostly the same:

  • Neither the data warehouse nor the reporting system meet users' needs - the market and business orientation have shifted and the data warehouse is no longer able to deliver the required data in the necessary detail and time.

  • The data warehouse has lost its credibility because users don't trust the data and no one knows what's inside it.



Don't cure the symptoms - cure the disease


In his article "How Do Data Warehouses Die?", my colleague, Peter Hora, describes how data warehouses can go feral, breaking free of their chains. Many companies balk at DW's in this state and either try to drug them (new hardware, better governance) or get rid of them entirely  - terminating them and birthing a new one. But that's only a cure for the symptoms and is nothing like a cure for the disease.

In our opinion and experience, by introducing a meaningful metadata layer and improving communication with business users, many (not all) of the causes of a faulty data warehouse can be fixed. Which is why, before deciding to decommission a warehouse, we would recommend deploying Encyclopaedia, focusing on the quality of documentation, setting up good data governance practices and let the current system revitalize itself naturally.

However, if the decision to create a new one and decommission the old has already been taken, we can offer you some practical advice.




Decommissioning in practice


One of our customers in Telco came to just this decision. They started the huge project of creating a new data warehouse and reporting platform, while at the same time running a staged decommissioning of the old one.

The goal of the first stage was to create an all-encompassing Information Touchpoint. It had to contain the complete metadata documentation from both the new as well as the old warehouse, let users know what the BI department was producing and making, provide status updates for individual projects and establish a communication hub for IT, business users and vendors (suppliers/partners). A tough call but in all honesty what was required before the project began was just as tough, if not tougher -  documentation of the whole of the old DW.

Our client deployed Encyclopaedia (Semanta's wiki metadata platform)  and I'm happy to say it successfully covered most of our client's needs. After roll-out we witnessed a change in both user and BI behaviour. Discussions about documentation started taking place between both owners and consumers. Together with business users, terms and reports began to be defined, data models were illustrated and the relationships between all these objects visualised.

However, none of this would have happened if the customer hadn't followed through our initial recommendation of documenting and analysing its old DW thoroughly before deploying the new one. Document an old DW? That's sounds as hard as the goals that were set for Ency. How should we go about doing that? It's not easy but I can give you some practical advice from lessons we've learnt through experience.



So, how does it work?


1 ) Create a list of existing reports

Do you have any idea how difficult the question "How many reports do you have in your company?" is? Individual teams prepared lists of reports but the fact is, many more were found in the grey zone - unofficial reports of users who no one knew, but which were used more often than the official ones!


2 ) Assign owners to reports

It's important to define a business person as the owner of each report. A report needs someone to take responsibility for it. Every report also needs a guarantor - the man or woman who created it and knows all about it and the relationships it possesses.

3 ) The group of consumers

Draw up a list of users who use the reports for their work. Discussions need to be held with them about their needs and the future shape of the reports.

4) Classification (categorisation) of the reports

Sorting reports into those that you definitely want to discard, those you want to transfer to the new DW, and those that need to be analyzed further in terms of workload.

5) Reverse-engineering the old data warehouse

The old data warehouse might be very poorly documented. It could well be necessary to employ dozens of analysts to perform a gap analysis and identify what is already in the new DW and what still needs to be transferred.

Within the Encyclopaedia project I mentioned above, we were able to reverse-engineer the old data warehouse data model a little more simply. Expensive software licenses or deep technical knowledge weren't necessary to set-up all users with online access to a searchable list of all data items. If a data model can be reverse-engineered into PowerDesigner, it's a simple step to publish it in Enyclopaedia. Then the whole inventory of the data warehouse can be accessed by everyone via a web interface - not only analysts, but also business users. The structure is identical 1:1, is as detailed as the warehouse, generates automatically and it saves not only time, but money too :)

6) Mapping reports to data sources

Selected reports need to be mapped to their data sources (database / admin / tables) so impact analyses can be performed on them, allowing their links and relationships to be displayed.


7) Communication

Many stakeholders and teams were involved in the decommissioning process. Communication between them had to be good or the project wouldn't have been successful. In the example above, collaboration took place in Encyclopaedia, which not only facilitates communication but also ensures the public visibility of agreements.


Recommendations when decommissioning

We believe that the correct, complete, and for the most-part automated documentation of all data inputs and outputs can maximize the effectiveness of the work of all stakeholders (more in "BI e-shop – how to sell BI outputs to business users"). The above recommendations are just the beginning of a data warehouse's decommissioning process but they bring about fast, effective and visible benefits:

  • transparency throughout BI

  • a change in the attitude and commitment of business units and the opening up of communication

  • changing many BI employees' work in a meaningful way - removing duplication

  • immediately establishing basic governance - see a "Data Governance - A few steps to take you from Zero"


You should start with the documenting, not coding!

Good luck!