09 November 2011

No id, please

When I was 16 I got my first id card, that’s obviously a milestone in every person’s life. Not even using it for much (maybe for seeing M rated movies) it means that I have “id”. The second time I got an id was after I immigrated to Australia; as an immigrant to show Australian id is a real milestone.
In the SQL world - databases vs. data warehouses - we have two process. Learning databases you learn that normalization involves that each table has a unique key, and that every row in each table can be identified by a unique key. The first normal form requires every row to have a unique key. This unique key can be an “id” - identifier, or a composite key.
For many years I’ve been working with MS-Access. In Ms-Access, the best practice would be to have the unique key as an Identifier and not as a composite key; that’s due to the fact that if you want to link from one database to another it links the tables better. It’s usually best practice to split the access application to back end (data) and front end (user interface)  in two different mdb files, so when you try to change network location (or migrate the date to sql server) and you need to change “in bulk” the location of the referring database, if you have a field that’s defined as an “id” it would be done quicker and more accurately than it would if you have a composite key.
In data warehouses’ design things are different. While designing a dimension every row must have a unique key which is  a foreign key in the fact table. This unique key better be truly unique: not only it should be different than the “name” or the natural key of the external system, it should also be acting as a surrogate key, in case we’re going to change the “external system” that we’re working with. But apparently, a fact table doesn’t have to have a key! A unique composite key is “good enough”, and that's because we're dealing mainly with measures, and some people claim that even that isn't really necessary. Think "Sum" and then "group by"; no, we don't need Id as identity over there, do we? Moreover, I’ve found out that using a composite key gives me the ability to minimize the data and to warn me whenever I’m doing an error, like inserting the data with the wrong date. (I don’t disable the indices while inserting; I’d rather lose half a minute waiting for data to be inserted, than lose a day’s work thinking what went wrong with the job I’ve done. For my size of database it is “a good enough practice”).
So far I’ve found out that’s personally, this is the most difficult part of “denormalization”. Another thing which I needed to get my head around is that actually, when we’re discussing Hierarchies in SSAS, we don’t really need the “id” there, especially not for Slowly Changing Dimensions. Let’s take “employee” as an example: I’d group by the key of “employee id” and add a line for a change in the name, for example. As I group by the “employee id” I would like to see all that’s relating to a certain employee no matter if they got married/divorced in October and therefore changed their name... Keeping the “id” as an attribute, but outside of the “Hierarchy”, enables me to browse the Hierarchy correctly without seeing repeating lines of Employee Id where I shouldn’t.

03 November 2011

Adding parameters to my MDX query

 In my last post I've discussed a recursive MDX query. In this post I'll describe how I've parametrized the date for this query.
As you remember, in that query there was a set defined like this:
SET [OrderedFUM] AS (
    FILTER(   
    ORDER (([Dim Portfolio Client].[Dim Client Group Id].children,
        [Dim Date].[Id].&[20110630]),
        [Measures].[FUM Value], BASC),
        ISEMPTY ([Measures].[FUM Value] )=0)
       
    )     



In real life, of course, I don't want to make a specific date; I want to add the date as a parameter to my MDX query. The requirement is for "sum per client per month" as the data is stored on a monthly basis. After lots of mocking around I've done the following, and I recommend following the same procedure for anyone else who is trying to parametrize an MDX query in SSRS:
  1. Create a new report in BIDS.
  2. Define a data connection.
  3. Copy the MDX query "as is" into the Query Designer (the DataSet part)
  4. Define a parameter in the "parameters" tab
In order to achieve that I have to use SSRS: After I've created the Data Source and the Data Set and added the MDX "as is" into the Query Designer, I've created the report and formatted it to my liking. Then I've added the parameter in the "parameters" section. I called it "Month" and made it of data type text.
Now I need to add the "date" in the SET as a parameter, not a constant. Therefor, I had to change the defintion of the measure "OrderFum" set as follows:
SET [OrderedFUM] AS (
    FILTER(   
    ORDER (([Dim Portfolio Client].[Dim Client Group Id].children,
        STRTOMEMBER(@month)),        [Measures].[FUM Value], BASC),
        [Measures].[FUM Value]<>0)
    )


And in order to make the code more readable (=less references to parameters), I changed the other reference to "dim date" in the RANK function from

[Dim Date].[Id].&[20110630]

To


[Dim Date].[Id].currentmember.


And now in the "parameter" box I need to insert the following very clear code
[dim date].[id].&[this is where the day of the month actually gets in - the actual parameter]
For Example:
[dim date].[id].&[20110930]
You would expect that I would use the "@month" parameter differently, and chances are it could be true.While unfortunately there are changes in using StrToMember function between Sql Server 2008 R2 to Sql Server 2005, the StrToMember Functions are quite complicated, as Mosha clearly explains. In order to make this code a bit more usable, I just added [Dim Date].[id].&[20110930] as a default parameter; the beauty is, that if my business analyst wants to play around with this, he just needs to change the date, not the entire text in the text box. Of course it is prone to human error; I, for once, forgot that it is [Dim Date] and not [DimDate].... And yes, I've tried StrToValue but to no avail.
For conclusion: definitely not intuitive, get ready to bang your head in the wall several times during using parametrised MDX with SSRS.