Archive for category Ottawa Events
I recently had the opportunity to act as a facilitator in a Business Analytics Experience workshop. The Business Analytics Experience, or BAE, is a business simulation in which executives and analysts from the real world try their hand at running a fictitious company for a few hours, with a focus on business analytics. Through a Cognos 10 interface, they have access to the company's financials, pricing and marketing strategy, and various other parts of the enterprise. The simulation runs for four fiscal quarters, and in each the participants choose how they want to proceed with their corporate strategy. At the end of each quarter the numbers are run, and the participants can review their decisions and see whether they are meeting their targets.
It is a fun, interactive way to see business analytics in action. Even though the simulation is somewhat simplified compared to real life, it offers surprising depth and complexity, and additional modules can be included as time permits. It shows how insights gleaned from business analytics tools can be applied to real decision making.
If anyone is interested in learning more about the Business Analytics Experience or would like to get information about booking their own BAE session, please contact me and I would be happy to see what we can do for you.
I had the opportunity to hear Bill Inmon speak this week on a variety of subjects, including the proverbial divide in the business intelligence community between Inmon and Kimball data warehouse architectures. It was a very informative discussion, and even though I had read Inmon’s latest book I felt I came away with a better understanding of Inmon’s data warehouse philosophy and a better appreciation for what Inmon’s approach has to offer.
First and foremost, I found Inmon to be an engaging and down-to-earth speaker. I couldn’t help but notice that Inmon referred to Kimball by his first name, giving the sense that they are colleagues and perhaps even friends who respect each other’s work. He did say that some Kimballites he had spoken to over the years were very dogmatic and dismissive of his approach, but this did not seem to be a grudge he held personally against Ralph Kimball.
That said, Inmon highlighted the differences between the architectures first by pointing out the reason why each had been developed. Inmon noted that his architecture was designed to deal primarily with data integration across an enterprise, commonly referred to as the single version of the truth. Kimball's architecture was designed to make reporting faster and easier, as indeed it does. Inmon pointed out that the Kimball architecture tends to deliver a series of fact table-based data marts, joined by conformed dimensions, to give a "data warehouse". His approach is more holistic in the sense that an integrated data warehouse is built first, and data marts may follow.
A common misperception of Inmon’s architecture is that a data warehouse must be built in its entirety first. He said this is not so. An Inmon data warehouse can be built over time. He likened it to the growth of a city – you start out with certain districts and services and as the city grows the architecture of the city grows with it. You certainly don’t go out to build a complete city overnight; likewise with an enterprise data warehouse.
While it is true that Kimball’s approach is more practical and hands-on (and perhaps because of this many vendors have built data warehouse tools with Kimball architecture “baked-in”), Inmon did raise many valid and interesting points. His approach struck me as more enterprise-integration oriented as opposed to the almost ad-hoc nature of Kimball’s. I also found that I have followed some of Inmon’s approach without even realizing it – if you are archiving historical data, using an integrated staging area or enforcing “a single version of the truth”, you are to a certain degree following Inmon already. But of course if you have slowly-changing dimensions, star schemas and surrogate keys you are following Kimball too. Ultimately, Inmon said a hybrid approach is certainly a valid and viable option. Perhaps one day instead of seeing each as a competing architecture we will see each as a “tool set” we can draw upon, and the Kimball versus Inmon debate will finally be put to rest.
On October 1, 2012, legend/guru Bill Inmon spoke to the Ottawa data warehousing and BI community at an event organized by the local chapter of DAMA in conjunction with Coradix. Among other subjects, Mr. Inmon spoke at length on the idea of "Textual ETL", a method for bringing semi-structured and unstructured data into the data warehouse and making it available for analysis using conventional BI tools.
Mr. Inmon estimated that at least 80% of the data in an enterprise exists in this form – as emails, Word documents, PDFs, etc. – and he has spent almost a decade on the problem of organizing this data into a queryable form. The result is what he calls Textual ETL.
In essence, this refers to a process for integrating the attributes of a text document (such as a contract) into a database structure that then enables query-based analysis. In the case of a contract, the document might contain certain key words that can be interpreted as significant, such as "Value" or "Royalties". Rather than simply indexing the document, the Textual ETL process (which can involve over 160 different transformations) is designed to take unstructured documents and produce database tables that enable the user to create "SELECT"-style queries. For a contract-type document, such queries might answer questions like "find all the contracts with a value between X and Y that refer to product Z".
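To make the idea concrete, here is a rough sketch of the kind of query such a process might enable. The table and column names are entirely hypothetical; Mr. Inmon did not present his actual schema.

    -- Hypothetical tables produced by a Textual ETL run over contract documents.
    -- Table and column names are illustrative only.
    SELECT d.document_id,
           d.document_name,
           v.attribute_value AS contract_value
    FROM   extracted_document d
    JOIN   extracted_attribute v
           ON  v.document_id = d.document_id
           AND v.attribute_name = 'Value'
    JOIN   extracted_attribute p
           ON  p.document_id = d.document_id
           AND p.attribute_name = 'Product'
    WHERE  CAST(v.attribute_value AS DECIMAL(18,2)) BETWEEN 100000 AND 500000
      AND  p.attribute_value = 'Product Z';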
A user with a system to manage such documents might have already added attributes such as "product" and "contract value" to the management system, thus already enabling such queries, but the beauty of Textual ETL is that it enables taxonomies to be applied to documents to resolve the meanings of the texts themselves. This can extend to the resolution of synonyms. Mr. Inmon gave the example of texts (emails, for example) that refer to different brands of cars – Porsche, Ford, and GM, say – or perhaps use the word "automobile", but never use the word "car" explicitly. A well-designed Textual ETL process would result in tables that allow the user to search for emails that refer to cars. It would do this by matching the brands of cars, or the word "automobile", to the word "car", in effect appending "car" to the brands listed.
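Again purely as an illustration (the table names are invented), a simple taxonomy table that maps raw terms to broader terms is enough to support that kind of search:

    -- A synonym/taxonomy lookup: emails that mention 'Porsche', 'Ford', 'GM'
    -- or 'automobile' can be found with a search on 'car'.
    SELECT DISTINCT e.email_id,
                    e.subject
    FROM   email_document e
    JOIN   extracted_term t
           ON t.document_id = e.email_id
    JOIN   taxonomy_term x
           ON x.raw_term = t.term          -- e.g. 'Porsche' -> 'car'
    WHERE  x.resolved_term = 'car';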
The process can be extended to dealing with documents where the same expression might mean very different things. Doctors may use similar, short expressions that mean different things depending on context. The application of Textual ETL to these kinds of documents would (must!) resolve these to different meanings.
The problems of implementing Textual ETL don’t seem trivial, and Mr. Inmon only presented a bare outline of how it is done. However, the implications for organizations that produce or deal with huge amounts of unstructured but critical texts – which is almost any organization of any size – could be considerable. In theory Textual ETL enables items that are thought of as not part of the normal domain of data warehousing to be brought into the data warehouse and subjected to the same kinds of analysis normally applied to such things as inventory levels, sales records and so forth.
Last night I attended a presentation hosted by OttawaSQL.net and delivered by Matt Masson on the newly released SQL Server 2012. I was principally interested in Data Quality Services (DQS), the latest addition to the SQL Server lineup.
Boiled down to its simplest form, DQS acts as a data "spell checker" that can apply statistical data correction, user-defined knowledge base data correction, or third-party web service data correction. In the DQS interface, users define Term-Based Relations (TBR) rules, which can be applied against the data set. While correcting data, DQS will also generate its own list of rules, which you can then validate.
Another relative newcomer to the SQL Server lineup is Master Data Services (MDS), which first appeared in SQL Server 2008 R2. This serves as a central repository of “golden” records, the single source of validated truth. It allows for a Master Data model to be generated and kept. This then serves as a lookup source for Integration Services packages.
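MDS publishes master data to downstream consumers through subscription views in the MDS database, and an SSIS Lookup transformation can use such a view as its reference set. A minimal sketch, with made-up view and column names:

    -- Hypothetical subscription view published from an MDS model (the default
    -- MDS schema is typically 'mdm'; the view and columns here are illustrative).
    -- An SSIS Lookup transformation can use this query as its reference data set.
    SELECT Code AS CustomerCode,
           Name AS CustomerName
    FROM   mdm.CustomerMaster;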
SQL Server Integration Services (SSIS) is still the linchpin of data movement in the SQL Server environment. It integrates seamlessly with the DQS and MDS services, and it has seen feature enhancements and improvements such as a catalog (for ease of configuration, security and management), change data capture, and built-in reports for troubleshooting and logging.
Anyone interested in a deeper understanding of SQL Server 2012 should note that Microsoft will be hosting a free day-long workshop on June 22 at their Ottawa office.
Wayne Eckerson is a noted BI consultant who spoke recently to the Ottawa TDWI chapter. I’d call Wayne a guru, but someone once told me that guru was a polite word for charlatan. Wayne is the very opposite – he is a very down-to-earth speaker who delivered a direct, unpretentious and thoughtful presentation on the subject of BI organizational architecture.
One of Wayne’s interesting observations was that he sees the need for what he calls “purple people” for any successful BI organization. If we think of people on the business side as “blue” and the people on the IT side as “red”, then “purple people” are people that have a mix of skills that enable them to be effective at bridging the gap between the two worlds. I spoke to Wayne afterwards and he elaborated on the idea:
“Purple people are a blend of business and IT – not blue in business or red in IT but a combination of both. These are both senior and junior level folks. At the senior level, some start in the business and end up in IT and then usually come back to the business where they run a business technology group that acts as an interface between the business and IT. (In the BI world I call these teams BOBI – business-oriented BI teams.) Some in IT become very conversant with the business and do a good job meeting business needs. These are directors of BI who interface with business executives more than their technical teams just about, to present budgets, roadmaps, funding requests, etc.
At the junior level, things are trickier, and not as effective. Most companies have business requirements analysts who interview business people, gather requirements, and translate those into specs for developers. I usually find there is a lot lost in translation with these junior level purple people.”
Another one of his key observations in the presentation was that from a BI architecture/organizational perspective, we can think of reporting as being a top-down process, with (we hope!) needs analysis, clearly defined specs, a process for building and moving data marts and reports into production, various controlling structures and so on.
Analysis, however, doesn’t really lend itself to this kind of approach – analysts may not know the questions that they want answered until they begin to delve into the data in a very ad-hoc kind of way. They want to quickly add data sources, join things together, and perform analysis that will lead to more questions, potentially the requirement for more data sources, and so on.
This leads to the business attempting to work around IT to get what they want, including bringing in tools that IT isn’t prepared to support. Analysis ends up being a volatile, bottom-up process, driven by the business, and the organization may struggle to keep it under control. IT fears chaos, but – to some degree – real analysis has a chaotic, or at least unpredictable, character. BI practice has to recognize the contrast in the very natures of reporting and analysis to be effective.
Wayne is a regular blogger and author of books and reports, such as Performance Dashboards: Measuring, Monitoring, and Managing Your Business. If you get the opportunity to hear Wayne speak take advantage of it – he delivers a lot of thought-provoking content that has application in the real world.
I came away from Ottawa Code Camp with an interesting tidbit – the XML data storage capabilities of SQL Server. Although this feature has existed since SQL Server 2005, this was the first time I had actually seen it demonstrated. As virtually the entire Cognos 8 world is XML-driven, this has some interesting possibilities. The XML data type can force incoming data to conform to a defined XML schema in much the same way a table structure does in a relational database, but it will also store any XML data when no schema is defined. Data can be extracted from the XML data type as relational values or raw XML using XQuery, and it can be manipulated with XML Data Manipulation Language (XML-DML).
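For anyone who hasn't seen it in action, here is a minimal T-SQL sketch of the feature; the table and element names are invented for illustration.

    -- XML column, XQuery retrieval and XML-DML modification in T-SQL.
    CREATE TABLE ReportSpec
    (
        ReportId INT IDENTITY PRIMARY KEY,
        Spec     XML NOT NULL   -- untyped here; an XML schema collection could be bound instead
    );

    INSERT INTO ReportSpec (Spec)
    VALUES ('<report name="Sales Summary"><query package="GO Sales"/></report>');

    -- XQuery: pull scalar values and XML fragments back out
    SELECT Spec.value('(/report/@name)[1]', 'NVARCHAR(100)') AS ReportName,
           Spec.query('/report/query')                       AS QueryFragment
    FROM   ReportSpec;

    -- XML-DML: modify the stored XML in place
    UPDATE ReportSpec
    SET    Spec.modify('replace value of (/report/@name)[1] with "Revenue Summary"');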
Is anyone out there using this feature or is this a little known extra hiding away under SQL Server’s many other features?
I attended last night’s meeting of the Ottawa SQL.net Professional Association where we received a presentation of Microsoft SQL Server 2008 R2 by Damir Bersinic. R2 is SQL Server’s forthcoming release which should be available in May 2010.
When I got home, my wife asked me “Why didn’t they just call it SQL Server 2010?” Typically a version indicator like this marks a minor release, but it does sound a little strange. It reminds me of a story told by a fellow BI consultant a few years ago in which he was grilled by a U.S. customs agent for traveling to the United States for SQL Server 2005 training (“But it’s 2007!” the agent protested).
Despite the odd moniker, this release does have some interesting features particularly for security and Business Intelligence. Here are some of the highlights I noted:
Security: Building on the ability to encrypt columns in SQL Server 2008, it is possible with R2 to encrypt the database data (.mdf) and log files with what is referred to as "transparent data encryption". Specifically, this is to prevent anyone from walking off your business site with a USB key loaded with a copy of your entire database (a minimal sketch of enabling it appears after this list).
BI: With an emphasis on self service features, the new BI enhancements allow business users to bring large scale data sets directly into Excel with a tool called PowerPivot for Excel. Using in-memory storage, this allows you to manipulate millions of rows in Excel very quickly and easily.
SQL Server 2008 R2 Parallel Data Warehouse Edition: A SQL Server edition specifically designed for extremely large-scale data warehouses, holding tens or hundreds of terabytes of data.
Efficiency: SQL Server 2008 R2 further improves data compression, including page compression, allowing more data to fit on each page. This reduces the number of reads and writes required, improving database performance (see the compression example after this list).
Master Data Services: A web-based application that manages business definitions and rules for your master data. This is intended to simplify the management of these definitions, especially as they change over time.
Dashboards: There are now customizable dashboards for both DBAs and report administrators to monitor performance, resource utilization and other environment statistics.
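On the transparent data encryption point above, the basic steps to turn it on are straightforward; a minimal sketch, in which the password, certificate and database names are placeholders, looks like this:

    -- Enable transparent data encryption (TDE); all names and passwords are placeholders.
    USE master;
    CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';
    CREATE CERTIFICATE TdeCert WITH SUBJECT = 'TDE certificate';
    GO

    USE SalesDW;
    CREATE DATABASE ENCRYPTION KEY
        WITH ALGORITHM = AES_256
        ENCRYPTION BY SERVER CERTIFICATE TdeCert;
    GO

    ALTER DATABASE SalesDW SET ENCRYPTION ON;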
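And as a quick illustration of the compression feature (object names are again placeholders), you can estimate the savings first and then rebuild a table with page compression:

    -- Estimate the benefit of page compression, then apply it.
    EXEC sp_estimate_data_compression_savings
         @schema_name      = 'dbo',
         @object_name      = 'FactSales',
         @index_id         = NULL,
         @partition_number = NULL,
         @data_compression = 'PAGE';

    ALTER TABLE dbo.FactSales REBUILD WITH (DATA_COMPRESSION = PAGE);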
The word is out that SharePoint will be the standard platform for future Microsoft Business Intelligence. A SharePoint breakfast session will take place on April 22 in Ottawa, as well as in a number of other Canadian cities between April 19th and May 3rd.
It appears that in-memory storage/processing is becoming the new standard for Business Intelligence, as both IBM Cognos TM1 and Microsoft SQL Server PowerPivot for Excel are using it to great effect.
Yesterday I attended the first IBM Cognos Ottawa Users’ Group meeting held in more than 3 years. Following the presentations on emerging Cognos 8 technologies, there was a discussion of creating a steering committee for the group. It was agreed that the group should be neutral and user focused, and that it not be sales or IBM partner driven. As such, a number of IBM clients agreed to participate in the committee. So we should see more events upcoming in this forum, with a particular focus on IBM clients and their experiences with IBM Cognos products.
I will post additional information on future events as it becomes available. Anyone interested in participating in the steering committee of the IBM Cognos Ottawa Users’ Group can contact me at email@example.com.
I was at today’s presentation of MotioCI and my initial response is that every one of my Cognos clients needs to have this product. It offers a version control on your entire Cognos Business Intelligence environment, along with other key features such as regression and stress testing. As a consultant, I really like the idea of having a history of Cognos report changes – it gives you a credible record of what changes were made, when, and by whom. This can become a contentious issue when business users ask you where your report changes are, especially if they themselves accidentally overwrote them (reverting to a previous report version is also easily available in MotioCI).
I am looking forward to hearing about MotioCI training opportunities, which should be forthcoming. This is a powerful, extremely useful addition to the Cognos universe.
There are two interesting IBM Cognos events upcoming in the next two weeks. First, there is the IBM Cognos Performance Breakout Day which will be held on February 3, 2010. I am particularly interested in learning more about MotioCI and IBM Cognos Analytic Applications, both of which have featured presentations at this event. The second IBM Cognos event is the IBM Cognos Users’ Group meeting, to be held on February 11, 2010.
I will write up a summary on these events after the fact to share what I have learned.