Showing posts with label changing. Show all posts
Showing posts with label changing. Show all posts

Monday, March 26, 2012

Parent ID in a Slowly Changing Dimension

Hi There,

Just wondering if any of you implemented a (Kimball type 2) dimension structure, in which a ParentID column exists which points to a record from the same dimension table, using a SCD objects in SSIS. The ParentID column would have to be "Historical".

The challange here is that you would need to go through the table twice somehow, because if I would do a lookup of the parent record in the first run, I wouldn't be sure if I got the right parent record.

Thnx, Jeroen.

Jeroen Alblas wrote:

Hi There,

Just wondering if any of you implemented a (Kimball type 2) dimension structure, in which a ParentID column exists which points to a record from the same dimension table, using a SCD objects in SSIS. The ParentID column would have to be "Historical".

The challange here is that you would need to go through the table twice somehow, because if I would do a lookup of the parent record in the first run, I wouldn't be sure if I got the right parent record.

Thnx, Jeroen.

I don't understand the problem or question. I think if you define ParentID as 'historical' the SCD component will issue a new 'version' of the rows every time it detects a change in parentID value and it will 'expire' the previous 'version'. It’s just simple like that. It may be helpful is you provide an example of what you are trying to accomplish

|||

Why would your Id column change if you have a type2 change?

Are you using a typ2 column to lookup your parentID?

|||

Okay, let me try to give an example:

Lets say we have an employee named Doopie and his boss named Barkie, both in the dim_Employee table. In our case we work with start- and enddates.

ID Name ParentID Salary Stardate EndDate
10 Doopie 20 1000 1/1/1900 12/31/9999
20 Barkie NULL 4000 1/1/1900 12/31/9999

Okay, now, in the source system, Barkie gives himself a raise of 500. We want to track history of Salary. Now say we start the Data flow task with the Slowly changing dimension object. It could very well be, that Doopie's record is processed first. No changes there so Doopie's record remains unchanged. Now Barkie record which introduces a new row:

ID Name ParentID Salary Stardate EndDate
10 Doopie 20 1000 1/1/1900 31/12/9999
20 Barkie NULL 4000 1/1/1900 12/7/2006
21 Barkie NULL 4500 12/8/2006 31/12/9999

If Doopie's record was processed after the insert of Barkie's record, the lookup would return record 21 as the active Boss record, and introduce a new record for Doopie as well...

Thnx again!
Jeroen

|||

I see the problem now. Basically you need to make sure every child record point to the latest version of its parent.

Well, I don't see an easy way to do this at one pass. You may try to do an update after the dataflow with the SCD2 finish that assures every row is pointing to the latest version row of its parent. That could be safer in the case that an unchanged child row is not in your set of rows to be processed (assuming you are pulling only changed/new rows from the source). That is my best shot considering that I have not drunk my first coffee this morning

|||Why don't you normalize that table?|||

Phil wrote:

Why don't you normalize that table?

Hi Phil, I normalized some columns of it (name of parent), but the parent / child hierarchy goes about six or seven levels. So then I would have to add quite a few columns...

Rafael wrote:

You may try to do an update after the dataflow with the SCD2 finish that assures every row is pointing to the latest version row of its parent.

I can't just do an update, because the ParentID is defined as type 2 (historical) and should result in a new row for the child record as well...

Jeroen

|||

Jeroen Alblas wrote:

I can't just do an update, because the ParentID is defined as type 2 (historical) and should result in a new row for the child record as well...

So let's define 'historical'. Defining a column as 'historical' to me is intended to track changes on a particular attribute. In the example you gave at the beginning you said that no changes were made to the child row:

“No changes there so Doopie's record remains unchanged.”

So, why do you want to issue a new version of the child row? The only change that occurred was to the parent; hence you should generate a new row to the parent, expire the existing (parent)one and update all its children to point to new parent ID. After all, there is not change to track when you look at the child record.

In your example, I would issue a new row for Doopie only if I detected that Barkie is not anymore his manager; that way I would be tracking historical changes of the managers.

Under your approach, have you figured out what would happened if a change occurs to the topmost manager? You would have to generate a new row for the whole organization…kind of messy to me. BTW, the only way I see you could implement this is through recursive queries…if you are using SQL Server 2005 you could use Common table expressions for that.

|||

Indeed no changes are made to the attributes of the child record, unless you consider the ParentID column an attribute of the child record. And updating it will solve it for the current point in time, but as you can see in the example, yit would seem Doopie's parent would always have had a salary of 4500...

But I agree; if the topmost record would change I would get the whole tree inserted again ... Indeed that's not what I want.

So, I guess I will add the parent columns which must lead to a new child record (Firstname, Lastname, Login, etc) to the child record and update the ParentID to point to the active record of the parent.

Still, I think I must conclude that the dimensional modelling technique, at least to my knowledge, does not provide a good method of combining "tracking history" in combination with "recursive relationships".

Thnx, Jeroen

|||

The way I usually handle this is by using a business Key to uniquely identify each person (see EmployeeID)
If you have a single source for your employees, you can use that id, otherwise you have to generate one. Then, you
use this key for the relationship like so

ID EmployeeID Name ParentID Salary Stardate EndDate
10 100 Doopie 200 1000 1/1/1900 12/31/9999
20 200 Barkie NULL 4000 1/1/1900 12/31/9999

After the change the data would look like this:

ID EmployeeID Name ParentID Salary Stardate EndDate
10 100 Doopie 200 1000 1/1/1900 31/12/9999
20 200 Barkie NULL 4000 1/1/1900 12/7/2006
21 200 Barkie NULL 4500 12/8/2006 31/12/9999

|||

Jeroen Alblas wrote:

Indeed no changes are made to the attributes of the child record, unless you consider the ParentID column an attribute of the child record.

Indeed, I would not consider it an attribute of the child row.

Jeroen Alblas wrote:

And updating it will solve it for the current point in time, but as you can see in the example, yit would seem Doopie's parent would always have had a salary of 4500...

I disagree. It does not seem like Doopie's parent always have had a salary of 4500; what it actually seems is the current salary of her parent is 4500. You should not try to get historical information of a row through its children...I gues that should be part of educating end users on what a SCD 2 is and what is not.

Jeroen Alblas wrote:

But I agree; if the topmost record would change I would get the whole tree inserted again ... Indeed that's not what I want.

That is certanly not practical...

Jeroen Alblas wrote:

So, I guess I will add the parent columns which must lead to a new child record (Firstname, Lastname, Login, etc) to the child record and update the ParentID to point to the active record of the parent.

That was pretty much my original sugestion...good luck with that

|||

David Frommer wrote:

The way I usually handle this is by using a business Key to uniquely identify each person (see EmployeeID)

Thnx David. I do have the business key of the employee in the dimension. However, if you create a parent-child dimension in Analysis Services, you need a key which identifies a unique row in the dimension table. I didn't mention this before, but this is one of the things for which I use the column.

Jeroen

|||

Rafael Salas wrote:

You should not try to get historical information of a row through its children...I gues that should be part of educating end users on what a SCD 2 is and what is not.

Although I'm still convinced the ParentID is not quite as useful if it doesn't give the same result as joining using the business key in combination with start- and enddate, I think this a quite a good conclusion to wrap this up :) thnx.

Jeroen

Parent ID in a Slowly Changing Dimension

Hi There,

Just wondering if any of you implemented a (Kimball type 2) dimension structure, in which a ParentID column exists which points to a record from the same dimension table, using a SCD objects in SSIS. The ParentID column would have to be "Historical".

The challange here is that you would need to go through the table twice somehow, because if I would do a lookup of the parent record in the first run, I wouldn't be sure if I got the right parent record.

Thnx, Jeroen.

Jeroen Alblas wrote:

Hi There,

Just wondering if any of you implemented a (Kimball type 2) dimension structure, in which a ParentID column exists which points to a record from the same dimension table, using a SCD objects in SSIS. The ParentID column would have to be "Historical".

The challange here is that you would need to go through the table twice somehow, because if I would do a lookup of the parent record in the first run, I wouldn't be sure if I got the right parent record.

Thnx, Jeroen.

I don't understand the problem or question. I think if you define ParentID as 'historical' the SCD component will issue a new 'version' of the rows every time it detects a change in parentID value and it will 'expire' the previous 'version'. It’s just simple like that. It may be helpful is you provide an example of what you are trying to accomplish

|||

Why would your Id column change if you have a type2 change?

Are you using a typ2 column to lookup your parentID?

|||

Okay, let me try to give an example:

Lets say we have an employee named Doopie and his boss named Barkie, both in the dim_Employee table. In our case we work with start- and enddates.

ID Name ParentID Salary Stardate EndDate
10 Doopie 20 1000 1/1/1900 12/31/9999
20 Barkie NULL 4000 1/1/1900 12/31/9999

Okay, now, in the source system, Barkie gives himself a raise of 500. We want to track history of Salary. Now say we start the Data flow task with the Slowly changing dimension object. It could very well be, that Doopie's record is processed first. No changes there so Doopie's record remains unchanged. Now Barkie record which introduces a new row:

ID Name ParentID Salary Stardate EndDate
10 Doopie 20 1000 1/1/1900 31/12/9999
20 Barkie NULL 4000 1/1/1900 12/7/2006
21 Barkie NULL 4500 12/8/2006 31/12/9999

If Doopie's record was processed after the insert of Barkie's record, the lookup would return record 21 as the active Boss record, and introduce a new record for Doopie as well...

Thnx again!
Jeroen

|||

I see the problem now. Basically you need to make sure every child record point to the latest version of its parent.

Well, I don't see an easy way to do this at one pass. You may try to do an update after the dataflow with the SCD2 finish that assures every row is pointing to the latest version row of its parent. That could be safer in the case that an unchanged child row is not in your set of rows to be processed (assuming you are pulling only changed/new rows from the source). That is my best shot considering that I have not drunk my first coffee this morning

|||Why don't you normalize that table?|||

Phil wrote:

Why don't you normalize that table?

Hi Phil, I normalized some columns of it (name of parent), but the parent / child hierarchy goes about six or seven levels. So then I would have to add quite a few columns...

Rafael wrote:

You may try to do an update after the dataflow with the SCD2 finish that assures every row is pointing to the latest version row of its parent.

I can't just do an update, because the ParentID is defined as type 2 (historical) and should result in a new row for the child record as well...

Jeroen

|||

Jeroen Alblas wrote:

I can't just do an update, because the ParentID is defined as type 2 (historical) and should result in a new row for the child record as well...

So let's define 'historical'. Defining a column as 'historical' to me is intended to track changes on a particular attribute. In the example you gave at the beginning you said that no changes were made to the child row:

“No changes there so Doopie's record remains unchanged.”

So, why do you want to issue a new version of the child row? The only change that occurred was to the parent; hence you should generate a new row to the parent, expire the existing (parent)one and update all its children to point to new parent ID. After all, there is not change to track when you look at the child record.

In your example, I would issue a new row for Doopie only if I detected that Barkie is not anymore his manager; that way I would be tracking historical changes of the managers.

Under your approach, have you figured out what would happened if a change occurs to the topmost manager? You would have to generate a new row for the whole organization…kind of messy to me. BTW, the only way I see you could implement this is through recursive queries…if you are using SQL Server 2005 you could use Common table expressions for that.

|||

Indeed no changes are made to the attributes of the child record, unless you consider the ParentID column an attribute of the child record. And updating it will solve it for the current point in time, but as you can see in the example, yit would seem Doopie's parent would always have had a salary of 4500...

But I agree; if the topmost record would change I would get the whole tree inserted again ... Indeed that's not what I want.

So, I guess I will add the parent columns which must lead to a new child record (Firstname, Lastname, Login, etc) to the child record and update the ParentID to point to the active record of the parent.

Still, I think I must conclude that the dimensional modelling technique, at least to my knowledge, does not provide a good method of combining "tracking history" in combination with "recursive relationships".

Thnx, Jeroen

|||

The way I usually handle this is by using a business Key to uniquely identify each person (see EmployeeID)
If you have a single source for your employees, you can use that id, otherwise you have to generate one. Then, you
use this key for the relationship like so

ID EmployeeID Name ParentID Salary Stardate EndDate
10 100 Doopie 200 1000 1/1/1900 12/31/9999
20 200 Barkie NULL 4000 1/1/1900 12/31/9999

After the change the data would look like this:

ID EmployeeID Name ParentID Salary Stardate EndDate
10 100 Doopie 200 1000 1/1/1900 31/12/9999
20 200 Barkie NULL 4000 1/1/1900 12/7/2006
21 200 Barkie NULL 4500 12/8/2006 31/12/9999

|||

Jeroen Alblas wrote:

Indeed no changes are made to the attributes of the child record, unless you consider the ParentID column an attribute of the child record.

Indeed, I would not consider it an attribute of the child row.

Jeroen Alblas wrote:

And updating it will solve it for the current point in time, but as you can see in the example, yit would seem Doopie's parent would always have had a salary of 4500...

I disagree. It does not seem like Doopie's parent always have had a salary of 4500; what it actually seems is the current salary of her parent is 4500. You should not try to get historical information of a row through its children...I gues that should be part of educating end users on what a SCD 2 is and what is not.

Jeroen Alblas wrote:

But I agree; if the topmost record would change I would get the whole tree inserted again ... Indeed that's not what I want.

That is certanly not practical...

Jeroen Alblas wrote:

So, I guess I will add the parent columns which must lead to a new child record (Firstname, Lastname, Login, etc) to the child record and update the ParentID to point to the active record of the parent.

That was pretty much my original sugestion...good luck with that

|||

David Frommer wrote:

The way I usually handle this is by using a business Key to uniquely identify each person (see EmployeeID)

Thnx David. I do have the business key of the employee in the dimension. However, if you create a parent-child dimension in Analysis Services, you need a key which identifies a unique row in the dimension table. I didn't mention this before, but this is one of the things for which I use the column.

Jeroen

|||

Rafael Salas wrote:

You should not try to get historical information of a row through its children...I gues that should be part of educating end users on what a SCD 2 is and what is not.

Although I'm still convinced the ParentID is not quite as useful if it doesn't give the same result as joining using the business key in combination with start- and enddate, I think this a quite a good conclusion to wrap this up :) thnx.

Jeroen

Wednesday, March 7, 2012

Parameters Changing Parameters

I have two parameters, Vendor and Status and I need to change the value of Status when the user changes the vendor from all to a single vendor and grey out the status parameter or just have it show all.

Any Guildence?.

Are you viewing your reports through the web browser, or through a viewer control in a program? I don't know if this is possible for doing it through the web browser. But if you're doing it through a program, you could write your own dialog for inputting the parameters, and then pass those along to the report viewer.|||I am working with the report through Visual Studio.|||

May be you need to read about "Cascading Parameters" to solve your problem.

Here is one link: http://msdn2.microsoft.com/en-us/library/aa337426.aspx

--Amde

|||I ened up finded that this could not be done. So I created a textbox that would come up when the users query would not return any results and tell them that they would need to select All to find out the info they were looking for.

Parameters changing in PageHeader

Hi Everyone.
I am working on a report that returns information on multiple people.
I would like the name of these people to be shown on every page.
I know we can't have fields in the PageHeader so i have created a Report
Parameter with no prompt and linked it via the default value section on the
report parameter screen to the fullname field in my datasource.
I added a text box which has the assignment of:
=Parameters!FullName.Value
My problem is that for every person returned in my report i have the same
name appearing in the PageHeader. so...
Is there a way to have this parameter dynamically changing depending on what
persons i have in my report? could i have possibly linked the field and the
parameter incorrectly?
--
Thanks in advance,
Dave HuntYou can group your report by =Fields!FullName.Value and place the name field
in the group header or you can place your report into a list box and group
your list by =Fields!FullName.Value and drop the name field in the list area.
U. Tokklas
"DustpanDave" wrote:
> Hi Everyone.
> I am working on a report that returns information on multiple people.
> I would like the name of these people to be shown on every page.
> I know we can't have fields in the PageHeader so i have created a Report
> Parameter with no prompt and linked it via the default value section on the
> report parameter screen to the fullname field in my datasource.
> I added a text box which has the assignment of:
> =Parameters!FullName.Value
> My problem is that for every person returned in my report i have the same
> name appearing in the PageHeader. so...
> Is there a way to have this parameter dynamically changing depending on what
> persons i have in my report? could i have possibly linked the field and the
> parameter incorrectly?
> --
> Thanks in advance,
> Dave Hunt|||In addition to what Tokklas said, you can remove the report parameter for
the FullName. Instead, you will have the FullName placed in a group header
textbox (e.g. with the textbox called "TextboxFullName"). Then in the page
header, you can just another textbox with the following expression:
=ReportItems!TextboxFullName.Value
-- Robert
This posting is provided "AS IS" with no warranties, and confers no rights.
"Tokklas" <Tokklas@.discussions.microsoft.com> wrote in message
news:8BE101C1-F7E7-42CD-B6C2-74203A61CA9D@.microsoft.com...
> You can group your report by =Fields!FullName.Value and place the name
> field
> in the group header or you can place your report into a list box and group
> your list by =Fields!FullName.Value and drop the name field in the list
> area.
>
> --
> U. Tokklas
>
> "DustpanDave" wrote:
>> Hi Everyone.
>> I am working on a report that returns information on multiple people.
>> I would like the name of these people to be shown on every page.
>> I know we can't have fields in the PageHeader so i have created a Report
>> Parameter with no prompt and linked it via the default value section on
>> the
>> report parameter screen to the fullname field in my datasource.
>> I added a text box which has the assignment of:
>> =Parameters!FullName.Value
>> My problem is that for every person returned in my report i have the same
>> name appearing in the PageHeader. so...
>> Is there a way to have this parameter dynamically changing depending on
>> what
>> persons i have in my report? could i have possibly linked the field and
>> the
>> parameter incorrectly?
>> --
>> Thanks in advance,
>> Dave Hunt|||So Guys...
I have a number of tables in a listbox.
Each table has its own datasource.
One of these datasources returns a fullname.
What is the process to assign a group or get an expression into a group
header?
is it a proerty somewhere?
What exactly would i need to do to get my name repeating on each page and
change when a new record starts?
Do you know if in the future it will be posible to have fields in the page
header?
I appreciate your help very much guys! thanks again
--
Thanks in advance,
Dave Hunt
"Robert Bruckner [MSFT]" wrote:
> In addition to what Tokklas said, you can remove the report parameter for
> the FullName. Instead, you will have the FullName placed in a group header
> textbox (e.g. with the textbox called "TextboxFullName"). Then in the page
> header, you can just another textbox with the following expression:
> =ReportItems!TextboxFullName.Value
> -- Robert
> This posting is provided "AS IS" with no warranties, and confers no rights.
>
> "Tokklas" <Tokklas@.discussions.microsoft.com> wrote in message
> news:8BE101C1-F7E7-42CD-B6C2-74203A61CA9D@.microsoft.com...
> > You can group your report by =Fields!FullName.Value and place the name
> > field
> > in the group header or you can place your report into a list box and group
> > your list by =Fields!FullName.Value and drop the name field in the list
> > area.
> >
> >
> > --
> > U. Tokklas
> >
> >
> > "DustpanDave" wrote:
> >
> >> Hi Everyone.
> >>
> >> I am working on a report that returns information on multiple people.
> >> I would like the name of these people to be shown on every page.
> >>
> >> I know we can't have fields in the PageHeader so i have created a Report
> >> Parameter with no prompt and linked it via the default value section on
> >> the
> >> report parameter screen to the fullname field in my datasource.
> >> I added a text box which has the assignment of:
> >> =Parameters!FullName.Value
> >>
> >> My problem is that for every person returned in my report i have the same
> >> name appearing in the PageHeader. so...
> >> Is there a way to have this parameter dynamically changing depending on
> >> what
> >> persons i have in my report? could i have possibly linked the field and
> >> the
> >> parameter incorrectly?
> >> --
> >> Thanks in advance,
> >> Dave Hunt
>
>