Archive for February, 2010
I finally read a book by Inmon (DW 2.0, Inmon et al.) and must say I found it very interesting. As a long-time adherent of Kimball’s star schema based data warehouse, I must admit to some bias when approaching Inmon’s work. But Inmon’s philosophy did give me pause for thought. His critique of the star schema based data warehouse does have valid points: it can be “brittle,” as he puts it, and resistant to business change. Unfortunately, I did not find that he offered an alternative data modeling paradigm, at least not in this book, that would rectify this shortcoming.
Overall, to put it in terms of the proverbial forest and the trees, Kimball’s approach tells you how to properly fell a tree. Inmon’s approach attempts to explain forest management; it is vast and complex, intended to deal with vast and complex data. He describes the four sectors of the data warehouse, which are intended to manage data volumes as efficiently as possible, based on the statistical likelihood of the data being called upon. It is difficult to envision this in action, but if it could be done, it would be a fast and efficient system.
Kimball’s appeal to me is the simplicity. The star schema approach, like the SQL statement, is a simple concept that can be expanded to include vast complexity. The star schema also fits so perfectly with OLAP modeling it seems like they were made for each other. I think this has led to the wide adoption of the Kimball philosophy.
I have found that business changes can be incorporated easily within a star schema based data warehouse, as long as they are new dimension elements or measures. Difficulty ensues when new fact tables are required, over and over again, especially when granularity becomes an issue. Like an ever-expanding puzzle, Kimball’s approach allows additional facts to fit into the model as needed, so additional business requirements can be accommodated. However, if particularly voluminous data, unwieldy data structures or enterprise-wide standard measures across a huge corporation are concerns, Inmon’s approach may well be helpful.
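To illustrate the point about new dimension elements and measures, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (dim_product, fact_sales, product_category, units_sold) and the sample rows are made up for illustration; they are not from any real warehouse. The sketch only shows that an existing report query keeps returning the same result after a dimension attribute and a measure are added.

```python
import sqlite3

# Hypothetical star schema: one dimension table and one fact table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,   -- surrogate key
    product_name TEXT
);
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product(product_key),
    sale_date    TEXT,
    sales_amount REAL
);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales VALUES
    (1, '2010-02-01', 100.0),
    (2, '2010-02-01', 40.0),
    (1, '2010-02-02', 60.0);
""")

# An existing report: total sales by product.
REPORT_SQL = """
SELECT p.product_name, SUM(f.sales_amount)
FROM fact_sales f
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY p.product_name
ORDER BY p.product_name
"""
before = conn.execute(REPORT_SQL).fetchall()

# Business change: a new dimension attribute and a new measure.
conn.executescript("""
ALTER TABLE dim_product ADD COLUMN product_category TEXT;
ALTER TABLE fact_sales  ADD COLUMN units_sold INTEGER;
UPDATE dim_product SET product_category = 'Hardware';
UPDATE fact_sales  SET units_sold = 1;
""")

# The existing report is unaffected by the additions.
after = conn.execute(REPORT_SQL).fetchall()

# A new report can use the added attribute and measure.
by_category = conn.execute("""
SELECT p.product_category, SUM(f.units_sold)
FROM fact_sales f
JOIN dim_product p ON p.product_key = f.product_key
GROUP BY p.product_category
""").fetchall()

print(before == after, by_category)
```

A change of grain, by contrast, cannot be handled with an added column; it typically means a new fact table, which is where the difficulty described above begins.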
A business user recently expressed surprise that my data warehouse loading process would reject dirty data. His process didn’t, and so our numbers didn’t match. But he was running a simple query, with no check for code integrity against control tables, so his data set included invalid results.
In a Kimball-style star schema using surrogate keys, data integrity is enforced by lookups. In fact, you can’t really fudge it without making up new dimension keys out of thin air. And isn’t that worse? I say it is better to reject data and report it in another location than to force non-conforming data to conform. Unless specific business rules have been provided on how to clean dirty data, your best choice is to reject and report.
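The reject-and-report idea can be sketched in a few lines of Python. This is only an illustration under assumed names: the product codes, the PRODUCT_KEYS lookup, and the load_facts function are all hypothetical and do not describe my actual loading process. Rows whose code has no surrogate key in the dimension are set aside with a reason rather than loaded.

```python
from dataclasses import dataclass, field


@dataclass
class LoadResult:
    loaded: list = field(default_factory=list)    # rows that go into the fact table
    rejected: list = field(default_factory=list)  # rows reported elsewhere


def load_facts(source_rows, product_keys):
    """Look up a surrogate key for each row; reject rows with no match."""
    result = LoadResult()
    for row in source_rows:
        surrogate_key = product_keys.get(row["product_code"])
        if surrogate_key is None:
            # Do not invent a dimension key; reject and record the reason.
            result.rejected.append({**row, "reject_reason": "unknown product_code"})
            continue
        result.loaded.append({"product_key": surrogate_key, "amount": row["amount"]})
    return result


# Hypothetical dimension lookup (natural code -> surrogate key) and source data.
PRODUCT_KEYS = {"A100": 1, "B200": 2}
SOURCE_ROWS = [
    {"product_code": "A100", "amount": 50.0},
    {"product_code": "ZZZ9", "amount": 75.0},  # invalid code, not in the dimension
    {"product_code": "B200", "amount": 25.0},
]

result = load_facts(SOURCE_ROWS, PRODUCT_KEYS)

# A simple query with no integrity check would total every source row.
naive_total = sum(row["amount"] for row in SOURCE_ROWS)
warehouse_total = sum(row["amount"] for row in result.loaded)

print(naive_total, warehouse_total, result.rejected)
```

Here the unchecked total and the warehouse total differ by exactly the rejected row, and the reject list gives the business user something concrete to reconcile against.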
Yesterday I attended the first IBM Cognos Ottawa Users’ Group meeting held in more than three years. Following the presentations on emerging Cognos 8 technologies, there was a discussion about creating a steering committee for the group. It was agreed that the group should be neutral and user-focused, and that it should not be driven by sales or by IBM partners. As such, a number of IBM clients agreed to participate in the committee. So we should see more events in this forum, with a particular focus on IBM clients and their experiences with IBM Cognos products.
I will post additional information on future events as it becomes available. Anyone interested in participating in the steering committee of the IBM Cognos Ottawa Users’ Group can contact me at firstname.lastname@example.org.
I was at today’s presentation of MotioCI and my initial response is that every one of my Cognos clients needs to have this product. It offers version control for your entire Cognos Business Intelligence environment, along with other key features such as regression and stress testing. As a consultant, I really like the idea of having a history of Cognos report changes: it gives you a credible record of what changes were made, when, and by whom. This can become a contentious issue when business users ask you where your report changes are, especially if they themselves accidentally overwrote them (reverting to a previous report version is also easy in MotioCI).
I am looking forward to hearing about MotioCI training opportunities, which should be forthcoming. This is a powerful, extremely useful addition to the Cognos universe.