Archive for category SQL Server
Last evening I attended a presentation on Data Quality Services (DQS) by Microsoft BI specialist Stéphane Fréchette. This new feature is available in SQL Server 2012 and is part of the Enterprise Information Management stack of Microsoft BI, which includes DQS, Master Data Services (MDS) and the familiar Integration Services (SSIS).
First and foremost, DQS is intended as a business tool. DQS is designed for a data steward (typically a business user) who wants to define rules for cleansing data. It allows users to create a knowledge base of these rules, to cleanse their data, and to match duplicated data. This is a two-step process that always starts with cleansing and then, if necessary, moves on to matching. DQS never overwrites data but instead writes data out to target destinations.
DQS’s strength is in cleansing and matching, its two core functions. It does not appear to offer much in terms of data profiling, although it does allow you to review data as it is being cleansed and matched. SSIS is able to call DQS cleansing operations, but not matching operations. SSIS is able to deduplicate data through its own fuzzy matching feature or through Master Data Services (MDS).
DQS does have a few limitations. This first release is only able to read and write to Microsoft Excel and SQL Server 2012. As already noted, SSIS cannot make use of the matching operations. While it is an interesting foray into the realm of applied data governance, its heavy focus on cleansing makes it feel more tactical and less strategic. It will be an interesting addition to the overall SQL Server universe.
A client of mine recently ran into a problem with Cognos Data Manager and the newer SQL Server date/time data type called datetime2. It is an extension of the original datetime data type of SQL Server and is available in all versions from SQL Server 2008 onward. Datetime2 is incompatible with Cognos Data Manager, at least as far as calculations are concerned. Data Manager will process datetime2 dates as long as they are not manipulated or used in calculations. If they are, you will need to cast them to the original datetime data type; otherwise Cognos Data Manager will treat them as an unrecognized data type and fail the Data Manager jobs.
This is a fairly easy workaround, but it is something to be careful about when you are using Cognos Data Manager and interfacing with SQL Server 2008, SQL Server 2008 R2 or SQL Server 2012.
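As a minimal sketch of the workaround (the table and column names here are hypothetical), casting the datetime2 column to datetime in the source query lets Data Manager use the column in calculations:

```sql
-- Hypothetical source query for a Data Manager build.
-- OrderDate is stored as datetime2; casting it to the original
-- datetime type lets Data Manager manipulate it without failing the job.
SELECT
    OrderID,
    CAST(OrderDate AS datetime) AS OrderDate
FROM dbo.Orders;
```

Note that datetime has a smaller range and lower precision than datetime2, so the cast will fail on dates before 1753 and will round the fractional seconds.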
I recently ran into an old problem while working on an SSIS package. It had to do with the management of inferred members in a Slowly Changing Dimension (SCD) structure. When you encounter this error, SSIS complains that your table column is not of type DT_BOOL (Boolean), but when you check the table in question you see that the column is defined correctly. So what is going on?
The problem has to do with the data set, not the table structure. SSIS is really complaining about null values in the table itself. Correct your data set and the problem will be resolved.
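A minimal sketch of the fix, assuming a hypothetical dimension table and inferred-member indicator column, is to replace the NULLs in the bit column that the SCD transformation reads:

```sql
-- Hypothetical example: the SCD transformation chokes on NULLs in the
-- bit (Boolean) column it uses as its inferred-member indicator.
-- Backfill the NULLs with an explicit value before running the package.
UPDATE dbo.DimCustomer
SET    InferredMemberFlag = 0
WHERE  InferredMemberFlag IS NULL;
```

Alternatively, wrap the column in ISNULL(InferredMemberFlag, 0) in the package's source query so the data set never presents a NULL to the transformation.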
Last night I attended a presentation hosted by OttawaSQL.net and presented by Matt Masson on the newly released SQL Server 2012. I was principally interested in the new SQL Server service Data Quality Services (DQS), the latest addition to the SQL Server lineup.
Boiled down to its simplest form, DQS acts as a data “spell checker” that can apply statistical data correction, user-defined knowledge base data correction, or third-party web service data correction. In the DQS interface, users define Term-Based Relation (TBR) rules which can be applied against the data set. While correcting data, DQS will generate its own list of rules which you can validate.
Another relative newcomer to the SQL Server lineup is Master Data Services (MDS), which first appeared in SQL Server 2008 R2. This serves as a central repository of “golden” records, the single source of validated truth. It allows for a Master Data model to be generated and kept. This then serves as a lookup source for Integration Services packages.
SQL Server Integration Services (SSIS) is still the linchpin of data movement in the SQL Server environment. It integrates seamlessly with the DQS and MDS services. It has seen feature enhancements and improvements such as a catalog feature (for ease of configuration, security and management), change data capture, and built in reports for troubleshooting and logging.
Anyone interested in a deeper understanding of SQL Server 2012 should note that Microsoft will be hosting a free day-long workshop on June 22 at their Ottawa office.
SQL Server Analysis Services is a popular OLAP product included with Microsoft SQL Server. Since SQL Server 2005 especially, this product has been quite powerful and fairly easy to develop with. SQL Server provides the Business Intelligence Development Studio (BIDS), a Visual Studio-like product to aid the development of Analysis Services cubes.
For browsing and reporting on a cube, however, the choices have been more limited. Excel is a good choice, especially since Excel 2007, which contains enhancements that make creating cross-tab reports easier than in previous versions.
If your users are committed to Cognos PowerPlay, you can use this tool as well. Setting up a MS Analysis Services cube for browsing with PowerPlay is a little more involved than a regular Cognos cube, but is still quite easy to do.
The tool to use is PowerPlay Connect; the executable is ppconnct.exe.
This tool is used to create a binary “pointer file” with a .MDC extension. This file, once created, will behave like a PowerPlay OLAP Cube, but the underlying cube will actually be (in this case) a Microsoft Analysis Services cube.
Start PowerPlay Connect, select File… New to create a new MDC file. For the database type, select MS SSOS (ODBO):
You have a couple of choices next. If you know the server name for your SQL Server Analysis Services instance, you can enter it on the next line, under Server:[Port]. In this case I can enter “localhost”, as I am serving the cube from my local machine.
Alternatively, I can select the … button beside Database, and I will be presented with the Choose a Remote Cube dialog box. There I select Microsoft SQL Server OLAP Server at the bottom, and then select a connection I have previously created using the tool; in this case the connection is called local. I am then presented with a list of databases available on the connection “local”.
I can then open SSAS_Adventure_Works and the cube that exists in this particular database. A database might have many cubes available in it.
Alternatively I could create a new connection, by clicking on Connections… and then clicking Add. I enter the name I want to give the connection, and then the name of the server, and select Microsoft SQL Server OLAP Server and MSOLAP as the provider:
Since I selected the cube SSAS_Adventure_Works, we see this in the details of the connection string:
I can now click File… Save and save this as an .MDC file:
The file appears as a normal MDC cube, but is really just a pointer file to the SSAS database server:
Using PowerPlay, I can now open the MDC file as if it were an ordinary cube. I can navigate it generally in the same way I would navigate a Cognos cube, although some things such as Measure Groups that are part of the Microsoft approach to OLAP do not behave exactly the same way. Measures appear as a single list, much as they do in Cognos cubes.
PowerPlay Connect MDC files can be put on the network, or shared as any other file, and will work as long as the user has access to the underlying Microsoft database.
Often I am asked to quickly diagnose problems in Cognos Report Studio reports or Microsoft SQL Server Integration Services packages. A trick that works equally well for both is to pull the XML code of the BI object. This works especially well if you are trying to debug the behaviour of certain variables or settings and are having trouble following the code within the development environment. Object settings can be obscure or difficult to find in a GUI development tool but are often plain as day when you see them listed in XML code.
To pull XML code out of a Cognos Report Studio report, simply click “Copy Report to Clipboard” from the “Tools” menu. Paste the results into a text editor.
To pull XML code out of a Microsoft SQL Server Integration Services package, simply open the .dtsx file with a text editor.
This works best for exploratory purposes and if you are looking for specific objects in the XML code (which can be easily searched). I do not recommend editing XML code directly unless you are extremely careful and know exactly what you are doing – it is very easy to corrupt a file this way, making it unreadable. Once the problem is diagnosed and understood, changes can be made in the development tools. But this method of code exploration has saved me much time in problem diagnosis.
I came away from Ottawa Code Camp with an interesting tidbit – the XML data storage capabilities of SQL Server. Although this feature has existed since SQL Server 2005, this was the first time I had actually seen it demonstrated. As virtually the entire Cognos 8 world is XML driven, this has some interesting possibilities. The XML data type can force incoming data to fit its defined XML schema in much the same way a table structure does in a relational database, but it will also store any XML data when no schema is defined. Data can be extracted from the XML data type as pure data or raw XML through the use of XQuery (the XML Query language), and it can be manipulated with XML Data Manipulation Language (XML-DML).
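A minimal sketch of the XML data type's methods (the document and element names are made up for illustration):

```sql
-- An untyped xml variable; a column of type xml works the same way.
DECLARE @doc xml =
    '<order id="42"><item sku="A1" qty="2"/><item sku="B7" qty="1"/></order>';

-- Extract scalar data with the value() method...
SELECT @doc.value('(/order/@id)[1]', 'int') AS OrderId;

-- ...and raw XML fragments with the query() method.
SELECT @doc.query('/order/item[@qty > 1]') AS LargeItems;

-- XML-DML via the modify() method changes the XML in place.
SET @doc.modify('replace value of (/order/item[@sku="A1"]/@qty)[1] with "3"');
```

The same methods work on xml columns in a table, and a typed xml column (one bound to an XML schema collection) will reject documents that do not validate against the schema.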
Is anyone out there using this feature or is this a little known extra hiding away under SQL Server’s many other features?
I attended last night’s meeting of the Ottawa SQL.net Professional Association, where we received a presentation on Microsoft SQL Server 2008 R2 by Damir Bersinic. R2 is SQL Server’s forthcoming release, which is expected to be available in May 2010.
When I got home, my wife asked me “Why didn’t they just call it SQL Server 2010?” Typically a version indicator like this marks a minor release, but it does sound a little strange. It reminds me of a story told by a fellow BI consultant a few years ago in which he was grilled by a U.S. customs agent for traveling to the United States for SQL Server 2005 training (“But it’s 2007!” the agent protested).
Despite the odd moniker, this release does have some interesting features particularly for security and Business Intelligence. Here are some of the highlights I noted:
Security: Building on the ability to encrypt columns in SQL Server 2008, it is possible with R2 to encrypt the database data (.mdf) and log files with what is referred to as “transparent data encryption”. Specifically, this is to prevent anyone walking off your business site with a USB key loaded with a copy of your entire database.
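As a sketch, enabling transparent data encryption looks like this (the database name, certificate name and password are placeholders):

```sql
-- A certificate in master protects the database encryption key.
USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';
CREATE CERTIFICATE TdeCert WITH SUBJECT = 'TDE certificate';

-- Create the encryption key in the target database and switch TDE on;
-- data and log files are then encrypted transparently on disk.
USE SalesDB;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TdeCert;
ALTER DATABASE SalesDB SET ENCRYPTION ON;
```

Back up the certificate and its private key immediately; without them the encrypted database cannot be restored on another server.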
BI: With an emphasis on self service features, the new BI enhancements allow business users to bring large scale data sets directly into Excel with a tool called PowerPivot for Excel. Using in-memory storage, this allows you to manipulate millions of rows in Excel very quickly and easily.
SQL Server 2008 R2 Parallel Data Warehouse Edition: A SQL Server edition specifically designed for extremely large scale data warehouses, to hold tens or hundreds of terabytes of data.
Efficiency: SQL Server 2008 R2 has further improved row and page data compression, allowing more data to fit on each page. This reduces the number of reads and writes required, improving database performance.
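A minimal sketch of applying page compression (the table name is hypothetical); the estimating procedure lets you check the expected savings before committing to the rebuild:

```sql
-- Estimate how much space page compression would save on a table.
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'FactSales',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

-- Rebuild the table with page compression enabled.
ALTER TABLE dbo.FactSales
    REBUILD WITH (DATA_COMPRESSION = PAGE);
```

Compression trades some CPU for the reduced I/O, so it pays off most on large, read-heavy tables such as data warehouse fact tables.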
Master Data Services: A SharePoint-based application that manages business definitions and rules with respect to your data. This is intended to simplify the management of these definitions, especially as they change over time.
Dashboards: There are now customizable dashboards for both DBAs and report administrators to monitor performance, resource utilization and other environment statistics.
The word is out that SharePoint will be the standard platform for future Microsoft Business Intelligence. A SharePoint breakfast session will take place on April 22 in Ottawa, as well as in a number of other Canadian cities between April 19th and May 3rd.
It appears that in-memory storage/processing is becoming the new standard for Business Intelligence, as both IBM Cognos TM1 and Microsoft SQL Server PowerPivot for Excel are using it to great effect.