Archive for category Business Intelligence
Have you ever struggled to align your BI strategy with your strategic business plan? Are you seeking best of breed knowledge from across the performance management industry? Do you need an assessment of your performance management process? In short, are you looking for better BI strategy across your enterprise?
My friend and colleague Mike Baggott has a new series of seminars on strategic performance management that may be able to help. He is a 30-year veteran of business intelligence and the former director of product management at Cognos. His seminars aim to merge business strategy with meaningful business intelligence, resulting in business-driven BI. Mike has drawn on a deep well of BI knowledge in preparing them, not only from his own career but also from leading practitioners across the information industry such as Roland Mosimann, Stephen Few, Wayne Eckerson, Bill Inmon and many others.
These seminars focus on the elements of a successful BI approach and the alignment of strategic business goals with BI strategy. His material is technology neutral and does not pertain only to Cognos solutions.
His seminar outline consists of the following modules:
Business Driven BI planning
Management Overview (1 1/2 hours)
Management – Analytic Assets Inventory and Gap Analysis (1 hour)
Business Driven BI Requirements Analysis and Design
Business and IT – Analytic Content (1 day)
Business and IT – Advanced Analytic Content (1 day)
Business and IT – Agile Requirements Gathering & Design (1/2 day)
Business and IT – Governance and Compliance Access Control (1/2 day)
If you have any interest in these seminars you can contact Mike directly through his LinkedIn profile.
Last evening I attended a presentation on Data Quality Services (DQS) by Microsoft BI specialist Stéphane Fréchette. This new feature is available in SQL Server 2012 and is part of the Enterprise Information Management stack of MS BI, which includes DQS, Master Data Services (MDS) and the familiar Integration Services (SSIS).
First and foremost, DQS is intended as a business tool. DQS is designed for a data steward (typically a business user) who wants to define rules for cleansing data. It allows users to create a knowledge base of these rules, to cleanse their data and to match duplicated data. This is a two-step process that always starts with cleansing and then, if necessary, moves on to matching. DQS never overwrites data but instead writes cleansed results out to target destinations.
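The cleanse-then-match flow can be sketched outside of DQS itself. The toy Python below is only a conceptual analogy; the synonym table, sample records and similarity threshold are invented, and this is not the DQS client or its API:

```python
from difflib import SequenceMatcher

# Hypothetical "knowledge base": domain rules a data steward might define.
CITY_SYNONYMS = {"T.O.": "Toronto", "Tor": "Toronto", "Ottaw": "Ottawa"}

def cleanse(record):
    """Step 1: standardize values using the domain rules. Like DQS, we write
    corrected output to a new structure rather than overwriting the source."""
    cleaned = dict(record)
    cleaned["city"] = CITY_SYNONYMS.get(record["city"], record["city"]).title()
    cleaned["name"] = " ".join(record["name"].split()).title()
    return cleaned

def match(records, threshold=0.85):
    """Step 2: flag likely duplicates among the already-cleansed records."""
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            score = SequenceMatcher(
                None, records[i]["name"], records[j]["name"]).ratio()
            if score >= threshold and records[i]["city"] == records[j]["city"]:
                pairs.append((i, j, round(score, 2)))
    return pairs

source = [
    {"name": "jane  smith", "city": "T.O."},
    {"name": "Jane Smith", "city": "Toronto"},
    {"name": "John Doe", "city": "Ottaw"},
]

cleansed = [cleanse(r) for r in source]
print(cleansed)         # corrected records, written out rather than overwritten
print(match(cleansed))  # -> [(0, 1, 1.0)]
```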
DQS’s strength is in cleansing and matching, its two core functions. It does not appear to offer much in terms of data profiling, although it does allow you to review data as it is being cleansed and matched. SSIS can call DQS cleansing operations, but not matching operations; for deduplication, SSIS relies on its own fuzzy matching feature or on Master Data Services (MDS).
DQS does have a few limitations. This first release can only read from and write to Microsoft Excel and SQL Server 2012. As already noted, SSIS cannot make use of the matching operations. While it is an interesting foray into the realm of applied data governance, its heavy focus on cleansing makes it feel more tactical than strategic. Still, it will be an interesting addition to the overall SQL Server universe.
(Special “The Colour and The Shape” edition)
In Part 1 of this series we introduced the Esri Maps for Cognos (EM4C) product, which enables us to tie together BI-type reporting with the rich capabilities of Esri’s mapping software. In Part 2 we demonstrated how easy it is to use this software to connect point-type data in a Cognos report to a map. In essence, we can take points of data identified by lat/long values and connect them to a map, and then colour-code the points to represent different categories or types of data. In our example, we looked at crime data for San Francisco. The result enabled the user to make inferences from the geographic distribution and type of crime reports that would be difficult to make if the data were simply listed by address, or even grouped into neighbourhood categories.
In this installment, we will look at a slightly different way of displaying data within the context of geography – instead of displaying discrete points (which require lat/long values) we will categorize larger geographic areas, defined by shapes on the map.
Note that in this example we don’t have any “lat/long” type data here – instead, we have Retailer Province-State, which contains values representing the name of each state:
This time, instead of adding a Cognos X/Y Layer to our Esri map in the report, we will add a Cognos Shape Layer:
A Cognos Shape Layer acts much like an XY layer, except that instead of binding to lat/long points it binds report data to a map containing “shapes”, matching on descriptions common to both. In this case we set the map associated with the shape layer to one containing the shapes of US states. In the wizard provided we can match the shape names in the map we have selected (STATE_NAME) to the appropriate column (Retailer Province-State) in our query:
We select the measures we are interested in…
… and then configure the “shape join”, assigning colour-values to relative levels of each measure (in this case, Revenue):
We now have a map that lets us see, by quantile, how revenue compares by state:
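Conceptually, the shape join classifies a measure into quantile bins and assigns each bin a colour. Here is a minimal pandas sketch of that idea; the state figures and the colour ramp are invented, and this is not EM4C’s actual API:

```python
import pandas as pd

# Invented revenue-by-state figures; in the report these come from the Cognos query.
revenue = pd.DataFrame({
    "state":   ["California", "Texas", "New York", "Florida",
                "Ohio", "Oregon", "Nevada", "Utah"],
    "revenue": [980_000, 870_000, 760_000, 540_000,
                410_000, 330_000, 210_000, 150_000],
})

# Classify each state into one of four quantile bins, then label the bins with
# colours (light to dark), as the shape-join legend does.
colours = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]
revenue["colour"] = pd.qcut(revenue["revenue"], q=4, labels=colours)

print(revenue.sort_values("revenue", ascending=False))
```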
For example, here is the map showing Gross Profit:
Note that the legend shows the quantile breakdowns for each colour. As well, hovering over each state brings up information on the state:
Users are not limited to a single shape layer – multiple layers can be combined on a single map, and the layers can then be activated or deactivated by the user to show different data by different “shape”.
Shapes are not limited to conventional maps, of course. Floor plans provide an ideal source of shapes. Retailers can use shapes to identify revenue by area of a store, or property managers can look at building usage, perhaps over time. All that is needed is an Esri map whose shapes correspond to the physical areas the user is interested in, with an attribute that can be joined to a report column containing matching values.
(Special “If You’re Going To San Francisco” edition)
In part 1 of this series, we looked at how Esri Maps For Cognos – EM4C – allows us to embed a map from an Esri map server inside a Report Studio report. But the map is pretty useless if it doesn’t allow us to connect to our data and perform some kind of analysis that can’t be done with a regular list report, or with some kind of graph.
From a mapping perspective there are a couple of concepts that we need to keep in mind if we are going to bind business data to a map: one is the idea of a point, the other the idea of a shape.
We’ll start with a point. A point is a lat/long value on a map: it is (strictly speaking) an entity with no area. It could be a point that represents a store location, a home address, whatever you like. The important thing to keep in mind is that even if a store (or your house) occupies area, from a mapping/point perspective it is simply a point on the map.
So what kind of data can we plot using points? Crime data is one example – a police call is typically made to a particular address. If we can plot these locations on a map, by type, we might gain insights into what kinds of crimes are being reported not just by location, but by location relative to each other – what kinds of crimes cluster together, geographically.
Crime data for San Francisco for March, 2012 is available on the web, and this data set comes with both category of crime and lat/long of the police report. This makes the data set ideal for plotting on a map.
First, I set up a quick Framework Manager model that retrieves the data from my database. Then, we need a query in Report Studio that retrieves the data:
Note that we have a Category, Description, and X and Y values representing Longitude and Latitude respectively.
I add a map placeholder (as we did in Part 1) and then save the report. (I could, of course, add any additional report items, queries etc to the report that I wish.) I then open the map placeholder in Esri Maps Designer, add a base map, and then add a new layer: the special Cognos X Y Layer. I rename it Crime_Locations:
A wizard enables me to select the query associated with the Crime_Locations layer, which will display points:
Note the inclusion of a Unique Field – this is the IncidentNum from the original data.
Further configuration allows me to then assign the Lat/Long from the data set, and identify each point by the Category of crime.
I now have a set of symbols – coloured squares – that correspond to the categories of my data. When I view my report, I can see each crime, colour-coded by type, at the location where it was reported:
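As an aside, the same idea (plot lat/long points and colour them by category) can be sketched outside Cognos with a few lines of matplotlib. The handful of incident records below are invented stand-ins for the San Francisco data set; this is purely an illustration of the concept, not how EM4C renders its layers:

```python
import matplotlib.pyplot as plt

# A few invented incident records: (category, longitude, latitude).
incidents = [
    ("DRUG/NARCOTIC", -122.43, 37.77),
    ("DRUG/NARCOTIC", -122.44, 37.77),
    ("ASSAULT",       -122.41, 37.78),
    ("ROBBERY",       -122.40, 37.79),
    ("ASSAULT",       -122.42, 37.76),
]

# One colour per category, echoing the colour-coded squares in the report.
colours = {"DRUG/NARCOTIC": "orange", "ASSAULT": "green", "ROBBERY": "purple"}

fig, ax = plt.subplots()
for category, colour in colours.items():
    xs = [lon for cat, lon, lat in incidents if cat == category]
    ys = [lat for cat, lon, lat in incidents if cat == category]
    ax.scatter(xs, ys, c=colour, label=category, marker="s")

ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.legend()
plt.show()
```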
Even at this zoom level I can draw some conclusions about what areas have more crime – the north-east seems to have more reports than the south-east, for example. But by selecting specific crimes, and zooming in, interesting patterns begin to emerge.
The orange squares represent drug-related charges. The green and purple squares are assault and robbery charges respectively. The drug-related charges are more concentrated in one relatively small area, while the assault and robbery charges seem more spread out – but with a concentration of them in the area the drug charges are also being laid.
If we zoom in even closer, we can see that certain streets and corners have more calls than others in close proximity – that the crimes seem to cluster together:
But zooming out again, we see an interesting outlier – a rash of drug charges along one street, with what appears to be relatively few assaults or robberies:
Zooming in we see that this activity is almost completely confined to a 7-block stretch of Haight St., with virtually no activity in the surrounding area, and few robberies or assaults:
This kind of spatial relationship is extremely hard to discern from a list or chart, even a chart that breaks events like police calls down by geographic category of some kind. But using mapping, with a simple zoom we can go from an overall view of patterns of activity to a much higher degree of detail that begins to tell some kind of story, or at least warrant further investigation.
But wait, there’s more…
By hovering over an individual square, I can get additional category information from my underlying data, assuming I have included it in my query. In this case there is a sub-category of the call:
By adjusting the query I can re-categorize my data to yield results by, for example, day of the week, or sub-category. For example, here we can contrast Possession of Marijuana (green) with Possession of Base/Rock Cocaine (pink):
Marijuana possession seems more diffuse, although concentrated in a few areas. The cocaine charges are much more concentrated.
In our next entry in this series, we’ll take a look at allocating data to shapes, to colour-code areas to represent different levels of activity.
Cognos report writers have long been frustrated by the poor built-in support for GIS-type displays in Cognos reporting tools. True, there is a basic map tool included as part of Report Studio, but it is quite limited in functionality. It can be used to colour geographic areas, but lacks layering, zooming, sophisticated selection tools, and the kind of detail we’ve all become used to with the advent of Google Maps and the like.
There are a few map-related add-ons for Cognos reporting available. Recently I had the opportunity to take Esri’s offering in this space for a test drive with a 2-day training session at Esri Canada’s Ottawa office. I came away impressed with the power and ease-of-use offered by this product.
EM4C – Esri Maps For Cognos – came out of development by SpotOn Systems, formerly of Ottawa, Canada. SpotOn was acquired by Esri in 2011. The current version of the product is 4.3.2. The product acts as a kind of plug-in to the Cognos portal environment, enabling Report Studio developers to embed Esri maps, served up by an Esri server, in conventional Report Studio reports. From a report developer perspective EM4C extends Report Studio, and does so from within the Cognos environment. This is important: EM4C users don’t have to use additional tools outside the Cognos portal. From an architectural perspective things are a little more complex: the Cognos environment must be augmented with EM4C server, gateway and dispatcher components that exist alongside the existing Cognos components.
Then, of course, there are the maps themselves. Since this is a tool to enable the use of Esri maps, an Esri GIS server must be available to serve the maps up to the report developer and ultimately the user. For shops that are already Esri GIS enabled this is not a challenge, and indeed I can see many users of this product wanting to buy it because they have a requirement to extend already available mapping technology into their BI shops. However, if you don’t have an Esri map server, don’t despair – the product comes with out-of-the-box access to a cloud-based map server provided as part of the licence for the product. This is a limited solution that won’t satisfy users who have, for example, their own shape files for their own custom maps, but on the other hand if you have such a requirement you probably already have a map server in-house. If you are new to the world of GIS this solution is more than enough to get started.
So where do we start with EM4C? First, you need a report that contains data that has some geographic aspect to it. This can be as sophisticated as lat/long encoded data, or as simple as something like state names.
When we open our report, we notice we have a new tool: the Esri Map tool:
As mentioned, the EM4C experience is designed to enable the report writer to do everything from within Cognos. Using this tool we can embed a new map within our report:
So now what? We have a map place-holder, but no map. So the next step is to configure our map.
This step is done using Esri Maps Designer. This tool is installed in the Cognos environment as part of the EM4C install, and enables us to configure our map – or maps, as we can have multiple maps within a single report.
Esri Maps Designer is where we select the map layers we wish to display in our report. When we open it we can navigate to any Report Studio report in which we have embedded an Esri map:
In this case VANTAGE_ESRI_1 is the name of the map in my report; the red X indicates it has not been configured yet. Clicking Configure brings up our configuration. This is where we select a Base Map, and then link our Cognos data to a layer to overlay on the map.
As mentioned, out-of-the-box the EM4C product enables the user to use maps served from the Esri cloud. We will select one of these maps from Esri Cloud Services as the Base Map of our report:
When the base map is embedded, it becomes a zoom-able, high-detail object within the report:
Unfortunately, while the map looks great it bears no relationship to the report data. So now what?
In part 2 of this overview we will look at how to connect the report data points to the report map. It is the combination of the ease-of-use of BI tools (and the data they can typically access) with mapping that makes a tool like EM4C so powerful. We will symbolize data to create colour-coded map points that reveal geographic location and spatial relationships, potentially allowing users to draw conclusions they would not have been able to draw from list-type data.
What are Cognos Active Reports?
A Cognos Active Report is a special type of Cognos report designed specifically for offline access. It can be viewed in the Cognos portal or downloaded as an MHT file. This file can be moved anywhere and it will maintain its current data set, much like a spreadsheet. Most Cognos Active Reports are designed to look, feel and act like dashboard reports.
How do I create a Cognos Active Report?
In Report Studio, by choosing the Active Report template at the start of your design. The design of an Active Report in Cognos Report Studio is somewhat different from a standard Report Studio report – there are new control objects and different variable controls. Running an Active Report will generate an MHT file, not the standard output you expect to see from a standard report (HTML, PDF, Excel, etc.). You cannot easily convert a non-Active Report to an Active Report, or vice versa.
What about security?
Once your MHT file is downloaded, Cognos portal security is no longer applied. But this is also true of any other Cognos report output format, such as PDFs and spreadsheets. IBM has moved to alleviate this concern by allowing access codes to be set for Active Report files in Cognos 10.1.1.
Overall, Cognos Active Reports are intended for anyone who needs offline access to dashboard-type reports. It is yet another tool in the Cognos arsenal that can be called upon to meet specific business reporting needs.
I had the opportunity to hear Bill Inmon speak this week on a variety of subjects, including the proverbial divide in the business intelligence community between Inmon and Kimball data warehouse architectures. It was a very informative discussion, and even though I had read Inmon’s latest book I felt I came away with a better understanding of Inmon’s data warehouse philosophy and a better appreciation for what Inmon’s approach has to offer.
First and foremost, I found Inmon to be an engaging and down-to-earth speaker. I couldn’t help but notice that Inmon referred to Kimball by his first name, giving the sense that they are colleagues and perhaps even friends who respect each other’s work. He did say that some Kimballites he had spoken to over the years were very dogmatic and dismissive of his approach, but this did not seem to be a grudge he held personally against Ralph Kimball.
That said, Inmon highlighted the differences between the architectures first by pointing out the reason why each had been developed. Inmon noted that his architecture was designed to deal primarily with data integration across an enterprise, commonly referred to as the single version of the truth. Kimball’s architecture was designed to make reporting faster and easier, as indeed it does. Inmon pointed out that Kimball architecture tends to deliver a series of fact table-based data marts, joined by conformed dimensions to give a “data warehouse”. His approach is more holistic in the sense that an integrated data warehouse is built, and then data marts may follow.
A common misperception of Inmon’s architecture is that a data warehouse must be built in its entirety first. He said this is not so. An Inmon data warehouse can be built over time. He likened it to the growth of a city – you start out with certain districts and services and as the city grows the architecture of the city grows with it. You certainly don’t go out to build a complete city overnight; likewise with an enterprise data warehouse.
While it is true that Kimball’s approach is more practical and hands-on (and perhaps because of this many vendors have built data warehouse tools with Kimball architecture “baked-in”), Inmon did raise many valid and interesting points. His approach struck me as more enterprise-integration oriented as opposed to the almost ad-hoc nature of Kimball’s. I also found that I have followed some of Inmon’s approach without even realizing it – if you are archiving historical data, using an integrated staging area or enforcing “a single version of the truth”, you are to a certain degree following Inmon already. But of course if you have slowly-changing dimensions, star schemas and surrogate keys you are following Kimball too. Ultimately, Inmon said a hybrid approach is certainly a valid and viable option. Perhaps one day instead of seeing each as a competing architecture we will see each as a “tool set” we can draw upon, and the Kimball versus Inmon debate will finally be put to rest.
On October 1, 2012 legend/guru Bill Inmon spoke to the Ottawa data warehousing and BI community at an event organized by the local chapter of DAMA in conjunction with Coradix. Among other subjects, Mr. Inmon spoke at length on the idea of “Textual ETL”, a method for bringing semi-structured and unstructured data into the data warehouse and making it available for analysis using conventional BI tools.
Mr. Inmon estimated that at least 80% of the data in an enterprise exists in this form – as emails, word documents, PDFs etc. – and he has spent almost a decade on the problem of organizing this data into a form that is queryable. The result is what he calls Textual ETL.
In essence this refers to a process for integrating the attributes of a text document (such as a contract) into a database structure that then enables query-based analysis. In the case of a contract, the document might contain certain key words that can be interpreted as significant, such as “Value” or “Royalties”. Rather than simply indexing the document, the Textual ETL process (which can contain over 160 different transformations) is designed to take unstructured documents and produce database tables that enable the user to create “SELECT”-style queries. In the case of a contract-type document, such a query might answer a question like “find all the contracts with a value between X and Y that refer to product Z”.
A user with a system to manage such documents might already have added attributes such as “product” and “contract value” to the management system, enabling such queries, but the beauty of Textual ETL is that it applies taxonomies to documents to resolve the meaning of the texts themselves. This extends to things like the resolution of synonyms. Mr. Inmon gave the example of texts (emails, for example) that refer to different brands of cars – Porsche, Ford, and GM, say – or perhaps use the word “automobile”, but never use the word “car” explicitly. A well-designed Textual ETL process would produce tables that allow the user to search for emails that refer to cars. It would do this by matching the brands of cars, or the word “automobile”, to the word “car”, in effect appending “car” to the brands listed.
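A toy sketch of the synonym idea: the taxonomy, the sample documents and the single doc_terms table below are all invented for illustration, and this bears no relation to how Inmon’s actual Textual ETL product is implemented:

```python
import sqlite3

# Invented taxonomy: brand and synonym terms that all roll up to "car".
TAXONOMY = {"porsche": "car", "ford": "car", "gm": "car", "automobile": "car"}

# Invented documents standing in for emails; note that none contain the word "car".
documents = {
    1: "The new Porsche models ship next quarter.",
    2: "Budget meeting moved to Friday.",
    3: "Our GM fleet and other automobile leases are up for renewal.",
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_terms (doc_id INTEGER, term TEXT)")

for doc_id, text in documents.items():
    for word in text.lower().replace(".", " ").replace(",", " ").split():
        conn.execute("INSERT INTO doc_terms VALUES (?, ?)", (doc_id, word))
        if word in TAXONOMY:
            # Append the parent term so that queries on "car" find this document.
            conn.execute("INSERT INTO doc_terms VALUES (?, ?)",
                         (doc_id, TAXONOMY[word]))

# A SELECT-style query over what was unstructured text: which documents refer to cars?
rows = conn.execute(
    "SELECT DISTINCT doc_id FROM doc_terms WHERE term = 'car' ORDER BY doc_id"
).fetchall()
print(rows)  # -> [(1,), (3,)]
```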
The process can be extended to dealing with documents where the same expression might mean very different things. Doctors may use similar, short expressions that mean different things depending on context. The application of Textual ETL to these kinds of documents would (must!) resolve these to different meanings.
The problems of implementing Textual ETL don’t seem trivial, and Mr. Inmon only presented a bare outline of how it is done. However, the implications for organizations that produce or deal with huge amounts of unstructured but critical texts – which is almost any organization of any size – could be considerable. In theory Textual ETL enables items that are thought of as not part of the normal domain of data warehousing to be brought into the data warehouse and subjected to the same kinds of analysis normally applied to such things as inventory levels, sales records and so forth.
While there has been much buzz and excitement around the concept of Big Data in recent years, I myself have always had certain reservations about it. Not one of my clients works in Big Data or has even expressed interest in it. My clients read structured data from well-established ERP vendor systems, occasionally stepping out of that zone when necessary. While I hear a lot about Hadoop, I don’t know anyone using it. Surely I am not the only one out there.
But Big Data takes a certain leap in logic, in scope, in imagination. I recently learned of a two-day course offered by MIT that seeks to explain Big Data to corporate executives and business intelligence professionals alike. The very first thing it says on its website is that business people need to learn to look beyond their finance systems. Finance systems tell you what happened; Big Data can tell you what is happening out there now.
So there is certainly big potential in Big Data. It has the possibility of remaking the BI world as we know it. The MIT course, at $2,900 (tuition only), is not cheap but might help you get there a few steps ahead of everyone else.
Ventana Research CEO Mark Smith has an interesting blog post up with the subtle title “The Pathetic State of Dashboards”.
I’ve always been a bit of a dashboard skeptic. The fluff promoted by vendors (gauge-type displays for business metrics, for example) has always struck me as noisy and silly. A gauge-type display makes sense in a car, where second-by-second changes in pressure on the gas pedal create immediate changes in a gauge, which then feeds back into the pressure you apply (assuming you are paying attention), but there are few business requirements like this. Highlighting outliers is easily accomplished by conditional formatting. Using the “dashboard” as a metaphor – taking it from the real world of, for example, a car, and mapping it to business activity – is an idea that in my experience doesn’t often stand up to scrutiny. The driver’s seat of a car is a different kind of place than the chair in a cubicle, and BI tools are generally too generic for the kind of moment-to-moment operational-level activity implied by dashboards.
Dashboards as an entry point to data discovery may make a certain amount of sense, but drill-through reporting has been around for a long time. Clear exception reports, the kind that can be created easily with out-of-the-box reporting software, are generally of far greater utility than the products of graphics-rich “dashboard” software.