Maximizing Data Warehouse ROI and Usage?- Keep Most Granular Data

By rajgupta

This is one of those common-sensical  approaches to get the maximum out of your Data Warehouse. As you make your investments in Data Warehouse environment, you can enhance your ROI, by using this environment for diverse applications. Having most granular (or detailed) transaction level data is core to broad-basing the Data Warehouse applications.

Traditional use of Data Warehouse environment for the purpose of back-room analytics, is no longer applicable. It can now serve as a single reference source for any of your BI related information needs. Some examples are:

  1. Summary Analytics
  2. Enterprise Reporting (needs transaction level data)
  3. Performance Management
  4. Data Mining (needs transaction level data-mostly)
  5. Business Modeling (sophisticated models need transaction level data)
  6. Operational BI like single customer view for telemarketers, customer support (needs transaction level data)
  7. Ad-hoc queries by operational staff (needs transaction level data)
  8. Root Cause investigation for issues, where you need to drill down to problem areas (needs transaction level data)
  9. Business Applications with Embedded BI modules (may need transaction data)

There are reasons for using Data Warehouse as a single reference information source. This helps you to:

  1. Maintain consistency: If your summary MIS/analysis and enterprise reports are coming from different sources, you will struggle to keep the numbers in synch.
  2. If your production data needs an offline fix (like standardizing customer and product IDs), its better to do that data-fix in one place. If you have separate enterprise reporting and analysis platforms, you will need to do that data transformation at two places, instead of one.
  3. Data Auditability: A single information reference point having detailed data will provide a good audit-trail of your summary transactions/analysis.
  4. ETL synergy: If you have diverse systems, and you want to have some level of information integration, its better to do it at one place. Doing ETL for summary data warehouse and a detailed reporting database, will almost double your efforts.
  5. Overall platform ease: You maintain only one information infrastructure (administration, scheduling, publishing, performance tuning…).
  6. Ease of Change Management: Any change in your information requirements, or changes in your source systems will be managed and done at one place.

Then the question is- If this approach is so good, why many companies use Data Warehouse only for summary data? Keeping granular data in data-warehouse has its own challenges and has its own demands:

  1. Brings forth the real issues with transactional data: In summary data warehouses, you can ignore some of the transaction level data issues and do some patch-work to ensure that aggregated data has a level of acceptable quality. Bringing in granular data, will need more incisive surgery on your data issues. This will extend the time of implementation.
  2. ETL efforts go up: This is related to the first point. Your key plumbing task in DW will become larger and more complex.
  3. Existing robust and stable reporting and querying platforms: Why to fix, which ain’t broken? etc…

My answer to the above reasons will be that while you can be flexible for functional level data marts, you should go for granular data from Day 1 for your enterprise data warehouse. If you are in a hurry, create a quick-fix data mart on the side. For an enterprise level data warehouse, you can start with few high-priority business themes, but ensure that they are designed for long-term usage (i.e. granular data).

You can refer field tips for BI and Data Warehouse in my portal Business Intelligence and Performance Management.

Leave a Reply