Using Synonyms and Views

October 18, 2008

Using Synonyms and Views to better handle the database level changes

This tip is equally applicable for OLTP scenario.

Synonyms or Alias

Synonym is like an alias for a table name at a physical level. The idea is that the query which an end-user tool fires to access the data, should be ideally using a synonym than the actual table name. A synonym (or alias) is just like the name of the book-mark in your internet browser, while the underlying link is pointing to the actual URL. You can maintain the same name of the book mark, while changing the underlying hyperlink.

Similarly, if you use synonym for providing access to the users and end-user tools, the advantage is that you can rename the underlying tables (as your dimensional model evolves), while still maintaining the same aliases or synonym. You can create synonyms for tables, views, stored procedure, function and some other objects. The idea is to have user-friendly synonyms and also to have a better change management.

Views

A view is a virtual table on which the end-user tools or users can run queries. The views make it easy and user-friendly access to the database. Views are typically a single table made out of multi-table, complex queries. For example, you may have a snow-flake schema, which you may convert into a star-schema by combining the dimensional table and its extension.  OR you may like to create a single virtual table combining the fact table and dimensional table.

Content of this page is referred from using synonyms and views from Business Intelligence and Performance Management Institute.

Don’t be guided only by the business requirements for your Business Intelligence

October 15, 2008

There is much talked about business requirements gathering phase in your BI initiatives. As part of this BR phase, you go to the users and ask for-

  1. Their information requirements- Reporting as well as analytics needs.
  2. Look at the current set of reports, dashboards and scorecards.
  3. Check on any additional potential information needs they might have

This phase will provide you with inputs for modeling and designing your BI platform. The tip here is that this input will not be the single driving factor for you BI plans. The reasons are that information requirements keep on changing. By the time you deliver the data warehouse as per the business requirements, you may realize that half of them have changed. Therefore, one has to model and design your BI environment, so that it can-

  1. Accommodate wider set of information needs, without changing the model
  2. Be Flexible in such a way so that you can add new dimensions and measures without re-writing your data-warehouse.
  3. Respond quickly to the changed information requirements.

This is achieved by following means

Educate your users on the different styles of business requirements

Instead of putting requirements only in form of exact reports and MIS (which should still be done), users can articulate their requirements in form of-

  1. Which dimensions they need to have the ‘cuts’ and reporting?
  2. Which measures they want to be reported?
  3. What are the business hierarchies they want to have their information on?
  4. What is the level of detail (mainly linked to a hierarchy level); they need to have the information on?

Users need to be made aware that once you have a schema, you can churn out million different reports or analytic views. The idea is to define that schema. Point to note is that we are not doing a holistic modeling here.

Use conformed and foundation dimensions

I think we have beaten this subject of foundation dimensions & measures to death (we can’t help it as it is core to your BI success). Even if users are not asking for all attributes of customer (say) dimension, you should have a holistic customer dimension from the day one and populate it as well. This becomes easy, if the entire data is coming from a single customer master.

NOTE- Foundation dimensions and measures are important, but if they are coming in the way of your first set of data-marts, you may skip them so to stay with-in the given investment boundaries. For example, if you have to make an ETL routine to pull a holistic customer data (to populate the foundation dimension) from five different sources, you may skip it, as sponsors will not buy this kind of spend for ‘future flexibility’. However, if you are creating an enterprise data warehouse, don’t skip the foundation dimensions and facts.

Use domain expertise to guess future information requirements.

There are some standard measures and reports which are used in a given industry or function. Even if your users have not asked for it, we would suggest that you should include it in your scope.

The content for this post is referred from the page don’t rely only on business requirements for BI modeling from Business Intelligence and Performance Management Institute.

Big-Bang Enterprise Data Warehouse is a pipe-dream

October 10, 2008

You might have seen product providers’ sales team as they walk into their clients board-room, open-up their presentations carrying the first slide on what we call a ‘dream DW architecture’. A nice assortment of polygons and arrows talk about the every possible data source being integrated into a central data warehouse, which is provider of accurate and reliable information. This is followed by the ‘ideal’ benefits of an enterprise data warehouse with statements like ‘right information at the right time at the right place’, ‘Get all what you need to know on a click of a button’…etc

What intriguing is that why service and product providers make this pitch, when they perhaps know best that building a true enterprise data warehouse in one-go is next to impossible for any medium or large size organization. Surely intent is positive! they might be selling the ‘point of arrival’, but at the same time, they need to be educating their clients that this POA may be many years away.
Let’s talk about what a big-bang Data-Warehouse may involve:

  1. Having all possible business dimensions
  2. Addressing all possible business themes
  3. Picking data from all source systems
  4. Creating an enterprise data warehouse metadata repository…etc…

You must refer Data warehouse has unique challenges, to understand uniqueness of DW initiatives.

  1. Fuzziness of business requirements
  2. Lack of focus from the users, as life can go on without DW.
  3. Unpredictable use of DW environment…etc…

Set a right foundation and go phase wise:

If you feel that what we are talking make sense, one can adopt million different ways on how to phase out a DW initiative. This depends a lot on your Organization Readiness, how much you have invested into DW initiatives already, the level of success (or failures), you have created in building DW etc…What we have seen in our experience is that:

  1. There are million different business intelligence ‘current states’
  2. There are generally similar issues related to BI. If we talk to 50 different companies on their BI issues, there is more than 75% chance that they will come out with the similar set of answers.
  3. The solutions to these similar problems can be very different depending on an organization.

One more aspect one has to be clear about it that BI point of arrival (POA) is generally the same (enterprise integration of data, virtualization of information, high-availability of information…), in terms of BI capabilities. However the actual architecture implementation could be very different. It depends a lot on the journey to POA and also on how large or complex an organization is. Very few architectures will be anywhere close to (even in POA) the dream DW architecture

Coming to DW Journey to Point of Arrival

My professional network keeps on asking me, if there are standard rules for designing your journey to the point of arrival? Here is my ‘very simple’ tentative answer:

  1. Start with stand-alone data marts, and let users taste the blood.
  2. Combine the data marts at a functional level. Fix your functional level agenda first.
  3. Move gradually towards an enterprise level integrated data warehouse. You can refer integrating data-marts

There is a  rider to the above approach. It is must that you create a central governance structure around the Business Intelligence. You may call it Business Intelligence Competency Centre or something else. This central practice can set common standards, disciplines, oversight to ensure that you are not creating the stuff, which you may have a hard time integrating later. Moreover, too many independent data-marts can overload your source systems, if they are not managed well. This central practice can help you a lot in terms of providing inputs on how to have efficient design etc…

The contents of this page are referred from Big-bang Enterprise Data Warehouse is a pipe-dream

Fix Business Intelligence at functional level first

October 9, 2008

Sometimes I am approached by the information managers, with a grouse that functional heads are simply not ready to give away their strangle-hold on their functional level information structure. That is considered as a huge barrier to building an enterprise level data warehouse. My first question to them is ‘why do you want to integrate?’ OR ‘what kind of information you need, which will use both business and financial information?’ OR ‘is there a pressing business need for the same?’.

I am not a heretic. A well integrated enterprise data-warehouse has significant benefits, and organizations should pursue the same relentlessly and with discipline. However, sometimes we fall so much in love with this end-objective, that we ignore dozens of milestones which need to be covered, before we achieve the same.

Organization’s today are trying to manage the conflict of de-centralization (for speed and localization) vs. having well coordinated integration for strategic alignment (for better expense management and consistency), in all spheres of their operations. The same conflict is well-manifested in the Data Warehouse world. The best way to manage it is to let the functional level BI agenda being fixed to a certain extent, before this is elevated to an enterprise data-warehouse level.

By letting the functional level BI to evolve, one can have the following benefits:

  1. Function feels the ownership of information management: The worst thing which you can do in pursuing your enterprise data warehouse dreams is to take-away the sense of business ownership for BI. Information is key to managers and that’s why they want their MIS team sitting next to them. A team, which not only generates the MIS but also, understands the business and day-to-day issues of that function. One needs to build the ground for enterprise data-warehouse by letting functions take the ownership and control of their BI agenda, while providing a strong support, oversight and best practices. (Namely through BICC-Business Intelligence Competency Centre)
  2. Let the functions understand and appreciate the BI challenges: BI is not easy and functional heads soon realize that they need expertise and help. The best way to get the functional involvement in an enterprise data-warehouse is when one builds the credibility and knowledge-leadership. This does not mean that the central information management team will come-in only when they are asked for. As you give functional level flexibility for the business to pursue their BI agenda, one will need to have a strong central governance (for example, through BICC)
  3. Achieve the integrated BI environment at functional level: A function can be an entity in itself. For example a products management function, could be having hundreds of products (or thousands of product variants), hundreds of pricing packages, IRR models etc…Similarly a large sales & distribution management function may have dozens of different channels, hundreds of different sales compensation schemes, thousands of locations etc…If one is able to have a robust functional level BI, which is built as per defined enterprise standards, you would have more than half the journey covered  for your enterprise BI agenda.
  4. Build the knowledge and skills: As a function pursues the BI agenda within their functional boundaries, super-users and super-designers emerge, who have solid business knowledge and they are battle-scarred veterans of building a BI from zero-base. As you start working on integrated the functional level BI to enterprise level BI, you will have cooked resources to drive the same.

The areas of central governance should ensure that business are deploying common tool-set (one needs to manage the licensing centrally), best practices for designing ETL, Data Warehouse project management, data quality management practices. You can find a lot of detail on this subject in our chapter on Business Intelligence Management and Business Intelligence competency centre.

The contents of this post are referred from Fix BI at functional level first from Business Intelligence and Performance Management Institute

Data Normalization is not the best approach in Data warehouse Dimensional modeling

August 19, 2008

If you have managed the design of transactional systems, you would have known that in-spite of all its stated benefits (like data integrity, clean design..), normalization is not always the best choice, as it sometime makes the design too complex and also negatively impacts the performance (due to complex Joins). The same principle applies in Data warehouse as well, and much more. As Data warehouse population is typically done once in a day and it is done through an extensive Extraction, Transformation and Loading routines with extensive DW data quality checks, the DW data is consistent and has integrity.
The fundamental requirement of a DW is the fast response time of any expected and unexpected query (Refer Data Warehouse v/s Transactional Systems ). Therefore we expect dimensional models to be more de-normalized (or totally de-normalized). Any joint can put a large overload for a large query.

Balancing Tips

  1. Avoid Extremes: It is advised to use a balanced normalization approach like star-schema or snow-flaked schema (in limited cases as it can take normalization to unacceptable levels). A single big table with facts and dimension attributes is another extreme, which one should avoid.
  2. Descriptive in snow-Flake: Descriptive (or what we call them as non-classification) attributes in a dimensional model are those attributes, which are not used for analytics, but mainly for reporting. You do not do any kind of derivation or modeling around these attributes. examples are – Address (not pin-code as that can be used), description and Name. You can make these attributes as part of snow flake, whereby they are not directly linked to the fact table (refer Dimensional model schema to understand these terms). The reason for keeping any attribute in a star-schema model is to enable fast query response time on large and/or ad-hoc queries.
  3. Different approach for aggregate vs. transactional schemas: It has been mentioned before that for the same set of data, you should not mix the detailed and summary level in the same star-schema. You should have a separate star-schema for detailed data and separate schema for summary data. In that kind of situation, the chances are that summary schema will be used for analytics and detailed data for enterprise reporting or when you want to do drill down from summary data. In that kind of situation you can have the detailed data in more of a normalized form (star-schema), and have the summary data in relatively de-normalized form (snow-flake). For example, if you have the sales revenue data in the aggregate form at sales agent level, you may keep it in star-schema. The same data at invoice level (transaction level) can be kept in the snow-flake form.

The text of this tip is picked from Business Intelligence and Performance Management Institute.

Maximizing Data Warehouse ROI and Usage?- Keep Most Granular Data

July 25, 2008

This is one of those common-sensical  approaches to get the maximum out of your Data Warehouse. As you make your investments in Data Warehouse environment, you can enhance your ROI, by using this environment for diverse applications. Having most granular (or detailed) transaction level data is core to broad-basing the Data Warehouse applications.

Traditional use of Data Warehouse environment for the purpose of back-room analytics, is no longer applicable. It can now serve as a single reference source for any of your BI related information needs. Some examples are:

  1. Summary Analytics
  2. Enterprise Reporting (needs transaction level data)
  3. Performance Management
  4. Data Mining (needs transaction level data-mostly)
  5. Business Modeling (sophisticated models need transaction level data)
  6. Operational BI like single customer view for telemarketers, customer support (needs transaction level data)
  7. Ad-hoc queries by operational staff (needs transaction level data)
  8. Root Cause investigation for issues, where you need to drill down to problem areas (needs transaction level data)
  9. Business Applications with Embedded BI modules (may need transaction data)

There are reasons for using Data Warehouse as a single reference information source. This helps you to:

  1. Maintain consistency: If your summary MIS/analysis and enterprise reports are coming from different sources, you will struggle to keep the numbers in synch.
  2. If your production data needs an offline fix (like standardizing customer and product IDs), its better to do that data-fix in one place. If you have separate enterprise reporting and analysis platforms, you will need to do that data transformation at two places, instead of one.
  3. Data Auditability: A single information reference point having detailed data will provide a good audit-trail of your summary transactions/analysis.
  4. ETL synergy: If you have diverse systems, and you want to have some level of information integration, its better to do it at one place. Doing ETL for summary data warehouse and a detailed reporting database, will almost double your efforts.
  5. Overall platform ease: You maintain only one information infrastructure (administration, scheduling, publishing, performance tuning…).
  6. Ease of Change Management: Any change in your information requirements, or changes in your source systems will be managed and done at one place.

Then the question is- If this approach is so good, why many companies use Data Warehouse only for summary data? Keeping granular data in data-warehouse has its own challenges and has its own demands:

  1. Brings forth the real issues with transactional data: In summary data warehouses, you can ignore some of the transaction level data issues and do some patch-work to ensure that aggregated data has a level of acceptable quality. Bringing in granular data, will need more incisive surgery on your data issues. This will extend the time of implementation.
  2. ETL efforts go up: This is related to the first point. Your key plumbing task in DW will become larger and more complex.
  3. Existing robust and stable reporting and querying platforms: Why to fix, which ain’t broken? etc…

My answer to the above reasons will be that while you can be flexible for functional level data marts, you should go for granular data from Day 1 for your enterprise data warehouse. If you are in a hurry, create a quick-fix data mart on the side. For an enterprise level data warehouse, you can start with few high-priority business themes, but ensure that they are designed for long-term usage (i.e. granular data).

You can refer field tips for BI and Data Warehouse in my portal Business Intelligence and Performance Management.

What is Metadata Management?

July 24, 2008

Dear Readers,

Generally my posts are complete by themselves. However, this is a long subject, and that’s why I am not able to write whole of it in my post. Please refer What is Metadata? in my portal Business Intelligence and Performance Management. I have tried to give the definition of Metadata in its practical form, as well as its differences from Data Warehouse, Business Intelligence and Master Data Management.

Please share your feedback and question you may have related to this page.

BI and Data Warehouse Requirements-Your Home Work

July 23, 2008

In this post, I assume that people who are gathering business intelligence requirements are business domain experts and knowledgeable dimensional modelers (logical and not physical techie modelers).

The challenges in gathering business intelligence requirements are:

Business folks are too busy for BI: In a transactional system, business owner has a fairly good idea on what functionality, control or feature they are looking for. This is because it is linked to their day-to-day operations and survival.  Life still can go on without BI for another day.

Business does not know what it wantsthere are too many choices: In OLTP, one has mostly linear and limited options, one can choose from. In case of BI, you can have hundred of different information requirements.

Crises Driven: As its difficult for business to pin-down what it wants from information perspective, the information requirements skew towards immediate need.

One way to manage this challenge is to do an intensive home-work, before you meet the business owner. You home work should include going through:

  1. Strategy blue-print including SWOT, competition land-scape etc..
  2. Enterprise Business plan
  3. Functional Business Plan
  4. Minutes of the management meetings
  5. Goal-sheets of the key stakeholders
  6. Current MIS and Reporting Packs

This will give you a good idea on the business priorities and issues. In other words, you will need to carry with you,  the business landscape at functional ( for which you are building a Data Mart) and at an enterprise level. It does not mean that you would be leading the interviewee. This will enable you a better questioning and conscious-keeping for comprehensiveness. The biggest advantage is to be able to related to the interviewee. If interviewee feels that you know his world, he will open up much more.

NOTE- the above home-work may not only be done by the interviwer but also for the interviewee. While the business user is aware of his world and priorities, going through the above-said documents will re-inforce and systematize his thinking.

For more details around the business requirements, you can refer Data Warehouse Business Requirements Gathering in my portal Business Intelligence and Performance Management Institute.

Universal Data Warehouse Dimensions- Is it possible?

July 22, 2008

I have talked a lot about having universal & foundation dimensions in the data warehouse in my portal Business Intelligence and Performance Management Institute. In brief, Data Warehouse is typically a dimensional model, which is different from relational/OLTP model. In dimensional model, you have dimensions and their attributes in their own tables linking through a central ‘Fact Table’, which carries all the numbers (or measures).

Now, the idea is that one should have one standard dimension (and attributes) for a business entity. Lets say that there are 20 star-schemas (or cubes in OLAP lingo) in a data warehouse, and ten out of them are using customer dimension. The concept of foundation dimension is that all these ten cubes should be using an identical customer dimension. This helps in doing analytics across multiple cubes and significantly reduces the change and development effort.

One reality check one has to take is that for large organization, it is extremely difficult to have a single customer dimension. This is because they may be operating very diverse markets or products. Getting all the business heads to agree to a single dimension may take years. Many of my clients have asked on if these ‘standard’ dimensions are practically possible.

My answer to them is that yes its possible, if we do apply the following techniques:

Create Super-Sets: Take a convenient path of creating ‘super-set’ dimension, which can absorb diverse entitites. For example creating a super set customer dimension, which can absorb retail, corporate and group-customers.

Create different foundation dimensions:  You do not need to create a single customer dimension for the whole organization. You can create ‘retail customer’ dimension, ‘corporate customer dimension’ etc… Both these dimensions will be different, but they will be universal within themselves. The assumption is that you will not be required to do cross-navigation across retail and corporate customers.

Start with high priority Dimensions: Instead of trying to make universal dimensions for all entities, identify top 5-6 which are most critical. Pareto principle equally applies here. The critical dimensions can be customer, vendor, product, location, sales channel etc…

There is more to BI than Data Warehouse

July 21, 2008

Data Warehouse and BI are considered synonymous. The reality is that Data Warehouse is one of the components of BI. There are many more components, which are needed to be in place for an actionable BI platform from an IT perspective. These elements are:

OLAP server– This is a multi-dimensional database, which provides extensive analytical capabilities. It sits between data-warehouse and the end-user tools. It picks data from Data Warehouse, summarize it and store it in its OLAP multi-dimensional database. The OLAP database is designed in a way that it helps analytics and other BI functions. OLAP has wide range of pre-built analytical functions, which can be used by users or application which are accessing it.

Enterprise Reporting– These set of tools, provide enterprise level (mostly scheduled) reporting. These tools take their data from OLAP or directly from Data Warehouse. OLAP typically has summary data. When you need detailed transaction level data, one will have to take it from Data Warehouse. In the past BI was typically used only for analysis. However, as Data Warehouse and OLAP combination is expanding its use, enterprise reporting tools have started using DW+OLAP as the source.

Query and Analytics Tools– These are the tools, which enable you to do wide range of analysis. This typically deals with summary data (you would not look for individual transactions in our analytics). The good part of many BI platforms is that they enable you to drill down to the transaction level, if you need to investigate into details. This means that in the back-end, you move from OLAP (summary) database to the detailed data warehouse database.

Performance Management Tools– This is the world of Dashboards, scorecards, setting standards, goals, reporting on the performance variance etc. This breed of tools, enable you to manage the functional or enterprise performance, and link it to analytics and enterprise reporting. Therefore, if you have a dip in performance, you can do analytics to find the root cause, and use reporting to list the transactions which are contributing to the same.

Data Mining Tools– While the above three end-user tools (reporting, analytics and performance management) are core to an organization’s BI capability, Data Mining is the next level of sophistication. These are knowledge discovery tools, which generate patterns, trends and co-relations on the data.

Business Modeling tool– These tools enable you to create models (like pricing models, actuarial models, business planning models, sales projection model…).

Therefore, when your vendor states that they offer BI solution, do check on what components are they offering. While there are end-to-end BI platforms like SAS and Business Objects, there are many vendors, which provide competitive individual components.

For details around end-to-end BI, you can refer my portal Business Intelligence and Performance Management Institute.