Archive for May, 2012
Boolean Errors in SSIS
Posted by Scott Andrews in ETL, SQL Server on May 22, 2012
I recently ran into an old problem working on an SSIS data package. It had to do with the management of inferred dimensions in a Slowly Changing Dimension (SCD) structure. When encountering this error, you will find that SSIS complains about your table column not being of type DT_BOOL (or Boolean). But when you check the table in question you see that your table column is defined correctly. So what is going on?
The problem has to do with the data set, not the table structure. SSIS is really complaining about null values in the table itself. Correct your data set and the problem will be resolved.
What happened to Cognos Scheduler?
Posted by Scott Andrews in Cognos, ETL on May 16, 2012
A client of mine recently upgraded from Cognos Series 7 to Cognos 10. With it, the Cognos DecisionStream catalog was brought up to Cognos Data Manager. However, when my client went to schedule the Cognos Data Manager jobs she noticed that Cognos Scheduler is not part of the Cognos 10 environment.
Cognos Scheduler was an extra tool you could use to schedule a variety of Cognos 7 related tasks, including DecisionStream jobs. While a handy feature, I found that virtually all of my past clients never used it. Usually there was a corporate scheduling tool or the operating system’s scheduling feature would be used instead. So this was never an issue for one of my clients until now.
With Cognos Scheduler now retired and scheduling Cognos reports simply a function of reports on the portal, there is still a service in the Cognos 10 environment that allows you to run Cognos Data Manager jobs through IBM Cognos Connection. This is the Data Movement Service. Cognos Data Manager must be installed on your Cognos 10 server for this to work. You publish your jobs to schedule from Cognos Data Manager to the Cognos portal.
That said, most of my clients still use Windows scheduler or Unix cron jobs to schedule events like this.
Introducing Microsoft SQL Server 2012
Posted by Scott Andrews in Business Intelligence, ETL, Ottawa Events, SQL Server on May 11, 2012
Last night I attended a presentation hosted by OttawaSQL.net and presented by Matt Masson on the newly released SQL Server 2012. I was principally interested in the new SQL Server service Data Quality Services (DQS), the latest addition to the SQL Server lineup.
Boiled down to its simplest form, DQS acts as a data “spell checker” that can apply statistical data correction, user-defined knowledge base data correction, or third party web service data correction. In the DQS interface, users define Term Based Relations (TBR) rules which can be applied against the data set. While correcting data, DQS will generate its own list of rules which you can validate.
Another relative newcomer to the SQL Server lineup is Master Data Services (MDS), which first appeared in SQL Server 2008 R2. This serves as a central repository of “golden” records, the single source of validated truth. It allows for a Master Data model to be generated and kept. This then serves as a lookup source for Integration Services packages.
SQL Server Integration Services (SSIS) is still the linchpin of data movement in the SQL Server environment. It integrates seamlessly with the DQS and MDS services. It has seen feature enhancements and improvements such as a catalog feature (for ease of configuration, security and management), change data capture, and built in reports for troubleshooting and logging.
Anyone interested in a deeper understanding of SQL Server 2012 should note that Microsoft will be hosting a free day-long workshop on June 22 at their Ottawa office.
But Do You Drive a Ford?
Posted by Scott Andrews in Client Relations, ETL on May 3, 2012
As an ETL developer, I cut my teeth on Microsoft Data Transformation Services (DTS) in 1999. This experience led to engagements in Cognos DecisionStream, and later Microsoft SQL Server Integration Services (SSIS), Cognos Data Manager and more recently IBM InfoSphere DataStage. I also have experience in Oracle SQL Loader and Unix scripting for data loading purposes. In all, I have been working in ETL for more than 12 years.
But never mind all that. When it comes to contract opportunities, tools rule.
If truck drivers were hired the way ETL developers are, they would have ads like this:
Wanted: Truck Driver – must have 5 years experience driving a Ford
Most ads read X number of years in Y tool, usually at least 2 to 5 years. When you are new to a tool, even with a decade of experience in others, you can be dropped from consideration. This approach is very limiting, both to me as a contractor and to clients who are hiring. Their pool of applicants is limited, and they eliminate an excellent resource like me!
Being a self-taught computer techie in Delphi, VB.Net, Crystal Reports, Gupta databases and to a certain extent SQL Server itself (I am formally trained in Cognos and Oracle), I am capable of ramping up on new products and technologies and delivering professional-grade, quality results in a short amount of time. But I often find myself pigeon-holed into a small group of tools, and recent experience in new products is often heavily discounted if not written off completely.
ETL development itself is very conceptual. It is more than the tool, the database, or even how the tool and database work together. I recently reduced the runtime of an ETL process by 80% by using database techniques that were completely unrelated to the ETL tool in question. This technique could have worked with almost any database or ETL tool. As such ETL solutions are not as tool driven as managers or procurement might have you believe.
This is not to say that ETL tools do not have differences, sometimes even major ones. But at the end of the day, like a truck, they get your data from point A to point B. And just like any quality, experienced truck driver, ETL developers should be able to change their vehicle of choice and drive it home.