Archive for April, 2012
The Gartner Magic Quadrant for BI is a good place to start when looking at the rich field of players in the BI space. The usual suspects are always there – IBM, Oracle, Microsoft, MicroStrategy, etc. – but it is always interesting to look at tools that are less well-known, or fall outside the upper-right quadrant that everyone seems to aspire to (and judge products by – as a side note I’m thinking about “The Tyranny of The Upper Right Quadrant” as a subject for a future post.)
Tableau is an interesting OLAP-type analytical product that falls in the upper-left quadrant, qualifying it as a “challenger” in Gartner-speak. But those that like it like it a lot – Gartner goes on to describe it as “The ‘sweetheart’ of the quadrant.” Apparently customers love this product.
I took a quick look at the desktop version of the software, which is offered as a fully functional 14-day trial.
Tableau has an interesting history. It was started by folks with a strong interest in data visualization. From the beginning Tableau was positioned as a tool that would enable fast visual representations of data (original founders included a founding member of Pixar.) Tableau advertises itself as “a stunning alternative to traditional business intelligence”, attempting to carve out a niche in an area that Cognos, for example, has traditionally not been great at (in my opinion visualization has always been clumsy in tools even as advanced as Report Studio.)
Another area Tableau claims to excel in is in raw speed – “Bring your data into Tableau’s high performance data engine and work with it at blazing speed. And do it with a click—there’s no programming required. Tableau turns millions of rows of data into answers at the speed of thought.” goes the sales pitch. No “programming” required, but definitely some thinking.
When you start doing analysis with Tableau you are offered the ability to connect to a wide, impressive range of data sources. These include Excel, the usual commercial databases, etc., but also open-source favourites MySQL and PostgreSQL. As well, there is an option to connect to Cloudera Hadoop Hive. Tableau is plainly positioning itself for “Big Data”-type analysis.
When you select a relational-type data source, such as Microsoft SQL Server, you have the option to select one or more tables, and establish their joins using a series of dialog boxes. From a data-modelling perspective this interface feels a bit awkward, but it gets the job done at the desktop-level…
… and with clearly-defined keys and a simple data model this shouldn’t present data-savvy users with much of a problem – more on this below.
This is where it gets interesting. I created a MS SQL Server database consisting of 10000 customers, 50 products, and 100 million sales rows – a very simple model, but a large overall size for my hamster-powered laptop. I then created a MS Analysis Services cube to play with. However, working from the relational model, a user can connect to the database and import the data directly into Tableau’s native format – according to Gartner a column-oriented in-memory data engine. On my admittedly underpowered laptop this took a couple of hours, but performance when querying the imported data was quite impressive – it seemed at least as fast as the Analysis Services cube. This isn’t sophisticated benchmarking, but it indicates that Tableau’s engine definitely has some power. Using this feature assumes that the user is comfortable arranging the hierarchies of the data themselves, instead of having a modeller do it for them in a cube.
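For anyone who wants to reproduce this kind of test, a flat sales fact table like the one described above can be generated with a short script. This is a sketch only: the column names and value ranges are my assumptions, not the actual schema used, and the real test used 100 million rows (written to CSV here for bulk-loading into SQL Server).

```python
import csv
import random

def write_sample_sales(path, n_rows, n_customers=10000, n_products=50, seed=42):
    """Write a simple sales fact table as CSV.

    Sketch only: column names and value ranges are assumptions.
    The test described above used n_rows = 100_000_000.
    """
    rng = random.Random(seed)  # fixed seed for repeatability
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["sale_id", "customer_id", "product_id", "quantity", "amount"])
        for i in range(n_rows):
            w.writerow([
                i + 1,
                rng.randint(1, n_customers),   # foreign key to customer dim
                rng.randint(1, n_products),    # foreign key to product dim
                rng.randint(1, 10),
                round(rng.uniform(1.0, 500.0), 2),
            ])

# write_sample_sales("sales.csv", 100_000_000)  # then bulk-load into SQL Server
```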
This approach reveals something critical about Tableau’s market – this tool is meant for people who are comfortable with the world of databases and OLAP-style structures, and for whom creating joins, hierarchies and all the rest is a natural part of the way they think about the data – but who are also the very people interested in analyzing their data. The database, the joins, the model – all of this is a means to an end, carried out, at least to some degree, by the analysts themselves. This hints at Wayne Eckerson’s observation that real analysis is often a bottom-up process, with savvy folks in the business using the powerful tools now available to them to “end run” the IT department. This tool essentially builds in a kind of ETL between a database and a proprietary analytical structure. This isn’t mandatory, of course, and connecting to my Analysis Services cube was quite easy and natural, but this is something to think about.
As expected, visualizations are where Tableau excels. The “Show Me” tab gives the user a number of visualization options, with hints as to what is appropriate for what kind of data.
Many of the visualizations available are quite useful – for example, I was able to visually locate a customer who is “Tier 1” but has very low sales. Arranging this display took seconds.
Tableau offers the user the ability to connect simultaneously to multiple data sources. Here I have two data sources in the “Data” tab. Contrast this with the approach Cognos takes, where multiple data sources are put together in a package that hides this from the user. Once again, the idea is that the user knows the data (and how it relates) well enough to perform these kinds of tasks – but the user can act quickly to select the data sources they want and combine them as they see fit.
Digging into all of Tableau’s features is beyond the scope of this post, but this is definitely a thought-provoking product. The BI world seems to be in a never-ending struggle between quick, user-oriented tools and the more controlled, but less agile, enterprise-grade BI suites. Tableau seems to be positioning itself as a product for the highly competent analyst in a relatively small organization – or a small part of a large organization. Gartner provides some insight here: “Tableau’s products often fill an unmet need in organizations that already have a BI standard, and are frequently deployed as a complementary capability to an existing BI platform. Tableau is still less likely to be considered an enterprise BI standard than the products of most other vendors.” Tableau is not a general-purpose reporting tool – it is an analysis tool, for analysts.
Have you ever had a Cognos Transformer exported model (mdl) refuse to open because of unresolved duplicate orphans or some similar problem?
You can fix this yourself by manually editing the .mdl file with a capable text editor (Notepad won’t cut it, obviously). Once inside the mdl, you will be able to browse the categories of your cube and remove any that are giving you problems.
What you will find in the file is something like this:
Category 1535687 "K1T 4H" Parent 1535685 Levels 1535311 OrderBy Drill 1535277
  Value "K1T 4H" Lastuse 20071218 SourceValue "K1T 4H" Filtered False Suppressed False
  Sign False HideValue False IsKeyOrphanage False IsTruncated False Blanks False
Category 1535689 "K1T 4H2~2363" Parent 1535687 Levels 1535405 Lastuse 20071218
  SourceValue "K1T 4H2" Filtered False Suppressed False Sign False HideValue False
  IsKeyOrphanage False IsTruncated False Blanks False
Category 1535447 "1000061" Parent 1535689 Levels 1535409 Lastuse 20071218
  SourceValue "1000061" Filtered False Suppressed False Sign False HideValue False
  IsKeyOrphanage False IsTruncated False Blanks False
You can remove problem category records in their entirety. Once your problem records are removed, Cognos Transformer should be able to open the model normally.
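If you have many problem categories, hand-editing gets tedious, and this kind of removal can be scripted. The sketch below is my own illustration, not a Transformer utility: it assumes the simple record layout shown above (each record starts with a line beginning “Category &lt;id&gt;” and runs until the next such line), which is enough for dumps like the one here but is not a full MDL parser. Note that any child categories whose Parent points at a removed category may need removing too.

```python
def remove_category(mdl_text, category_id):
    """Remove one Category record from MDL text.

    Minimal sketch: a record is the line starting 'Category <id>'
    plus its continuation lines, up to the next 'Category' keyword.
    Not a full MDL parser.
    """
    out = []
    skipping = False
    for line in mdl_text.splitlines():
        if line.lstrip().startswith("Category "):
            # A new record begins; skip it if its id matches.
            skipping = line.split()[1] == str(category_id)
        if not skipping:
            out.append(line)
    return "\n".join(out)

# Example against the dump above (back up the .mdl first!):
# text = open("model.mdl").read()
# open("model_fixed.mdl", "w").write(remove_category(text, 1535689))
```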
Warning: Do take care editing Cognos model files in this manner. Please ensure you maintain backups of your files in case your manual fix goes awry.
I recently ran into a puzzling problem with a Cognos Unix installation. Even though the Cognos portal was running and fully functional, Cognos Configuration reported that the service was not running. As such, it could not be stopped through Cognos Configuration. Likewise, the command line cogconfig.sh -stop also had no effect. It was a runaway process with no control available through normal Cognos channels. If we shut down the Cognos process manually, then Cognos Configuration would not start up properly. At the very end of the process, it would report a failure – even though the Cognos portal was up and fully functional again.
After digging through the logs, I discovered that the key to this problem was the Cognos Bootstrap Service (CBS). The Cognos Bootstrap Service makes sure that the main Java process is running by monitoring the shutdown port.
The log file cbs_run.log reported:
CBSBootstrapService Process ID could not be written to file. The cogbootstrapservice -stop command will not work. Use 'kill -9' instead.
The log file cbs_isrunning.log reported:
CBSBootstrapService Process is not running.
So if the Cognos Bootstrap Service doesn’t run at all, Cognos Configuration has no way of knowing whether Cognos is running or not.
The file that was preventing our Cognos Bootstrap Service from starting was cogbootstrap_service.pid which resides in the c10\logs directory. Our Unix account did not have rights to write to this file, which was the cause of the problem.
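A quick diagnostic for this situation is simply to check whether the account running Cognos can write the pid file. The snippet below is a sketch of that check, not a Cognos tool; the path shown is the one from our incident, so adjust it for your install.

```python
import os

def pid_file_writable(path):
    """Report whether the bootstrap pid file can be (re)written.

    If the file does not exist yet, CBS will create it, so the
    containing directory must be writable instead.
    """
    if not os.path.exists(path):
        return os.access(os.path.dirname(path) or ".", os.W_OK)
    return os.access(path, os.W_OK)

# Path from this incident (adjust for your c10 install location):
# pid_file_writable("/opt/cognos/c10/logs/cogbootstrap_service.pid")
```

In our case the fix was simply to grant the Cognos Unix account write permission on this file (and its directory).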
Once the Cognos Bootstrap Service was able to run again, our Cognos startup/shutdown issue was resolved.