Início > Parent-Child > “Multiple parent” (parent-child relationship)

“Multiple parent” (parent-child relationship)

link:http://biwithjb.wordpress.com/2011/05/23/bom-bill-of-materials-scenarios-with-sql-server-analysis-services-2008-r2/#comment-22

While working on a Business Intelligence project where the data source is SAP R/3 and data is fetched directly from the SAP tables using SSIS 2008 with the Microsoft BizTalk SAP Connector; everything was smooth until one day I had to replicate a multiple parent parent-child relationship (aka bill-of-materials) that SAP creates in the Finance module.

These relationships occur when a child node can have more than one parent. Even though these types of relationships are not supported by default in SQL Server Analysis Services 2008 R2 I kept trying to implement them, until I’ve found a solution that works! In this document I explain how I did it.

In all bill-of-materials scenarios we have one fact table, one dimension table and one bridge or intermediate table that connects the fact table with the dimension table. This document is structured in 4 sections that will cover the 3 tables mentioned before plus one section for the required Analysis Services configuration.

Section 1 – Data Source Scenario

The data source tables are introduced and their structure explained in this section.  The fact table and its relationship with the dimension table are explained here.

This section will be quite familiar to those who have worked in a SAP environment, extracting data directly from SAP tables. If your scenario is not this specific SAP one, you can jump this section completely and go straight to section 2.

Section 2 – Building the Cost Element Dimension

A bill-of-materials dimension table has a slightly different structure than a normal dimension table in a non-BOM case. Section 2.3 covers this difference. The rest of the section details how to flatten the SAP hierarchy of nodes a leaves. If in your case you have a ready-to-go dimension table I still strongly suggest to read section 2.3.

Section 3 – Relate the Fact table with the Dimension table

This section describes the importance of the bridge or intermediate table that will connect the fact table with the dimension table.

Section 4 – Configuring the cube

Finally, section 4 details the configuration needed in Analysis Services 2008 R2 to make all parts work together in a bill-of-materials scenario.

Section 1 – Data source scenario

The specific solution covered in this document retrieves data for the Finance department to report on the expenditure of each department. In SAP terms that means the Cost Centers and Cost Elements, which are stored in the following tables’ structure:

Definition:

  • CSKS: Cost Center definition table
  • CSKA: Cost Element definition table

These two tables contain each individual Cost Center/Element identified by the OBJNR field in CSKS table and the KSTAR field in CSKA table.

Both Cost Centers and Cost Elements are structured in a hierarchy defined in the SETHEADERT, SETNODE and SETLEAF tables:

  • SETHEADERT: Description (long text) for the
    SETNODES
  • SETNODE: Declares for each node (SETNAME) which
    are its children (SUBSETNAME). This table is a normal recursive parent-child
    table. When a node doesn’t have any child it is considered a SETLEAF.
  • SETLEAF: Contains only the last members on the
    SETNODE hierarchy, i.e. the leaf nodes. Each leaf node spans a range of Cost
    Centers/Cost Elements (OBJNR/KSTAR); this range is declared in the VALFROM and
    VALTO fields.

The big problem is that each OBJNR/KSTAR can be in more than one SETLEAF and that one SETNODE can be in more than one SETNODE hierarchy.
This is known as a multi-parent parent-child relationship or bill-of-materials.

To complete the scenario and have an example let’s include our fact table.

COEP table is a fact table that stores for each expense in the company the amount and the cost center and cost element which that expense is related to.

For example, you buy a computer that goes to X (OBJNR) cost center and Y (KSTAR) cost element.

Relationships:

  • COEP is related with CSKS (Cost Center) by OBJNR field.
  • COEP is related with CSKA (Cost Element) by KSTAR field.
  • The rest of the fields are fact data.

Let’s have a simple example:

COEP Fact table:

  • OBJNR: KS100000120000
  • KSTAR: 0000404050

To make things simpler, in this document we will follow the solution for the Cost Element object. To implement the Cost Center object the same steps needs to be
followed.

The KSTAR 0000404050 appears in 2 leaf nodes (SETLEAF), PURSERV and MAINTREN. PURSERV node (SETNODE) appears under at least 2 different
parents as the following screenshot shows:

So, the expense that we have here needs to be reported under 2 different hierarchies but can’t be double billed if there is a parent node for the two hierarchy trees. This is a normal BOM scenario.

Section 2 – Building the Cost Element Dimension

2.1. Retrieve all SETNODE AND SETLEAF records

Using SSIS and the Microsoft BizTalk SAP Connector fetch all nodes and columns from the SAP tables described before and import them to a staging database in SQL Server.

The rows are filtered by the SETCLASS and SUBCLASS columns to retrieve only the ones that I’m interested. In my case SETCLASS = 0101 and SUBCLASS = QA10. These values could, and most probably would, be different in your company.

2.2. Detect the hierarchies and build the staging relationship tables

Hierarchy table for nodes (SETNODE)

Using a recursive stored procedure with a cursor inside, I build  a relationship table that contains all possible paths for the SETNODE hierarchies. The store procedure “walks” the SETNODE table and inserts rows in the Z_SETNODE_HIERARCHY table with the appropriate information.

Notice how the node with SETNAME equal to “109200” appears in 2 different hierarchies; one under “QA10_BJP” and the other under “QA10_CC”. It’s important to understand that this table has ALL possible hierarchy trees, that’s why node “109200” appears multiple times.

Complete LEAF information

A second table will store the complete SETLEAFs information crossed with the definition table CSKA. This table will be called Z_CSKA_SETLEAF.

To build this table I’m using another stored procedure, called p_Z_CSKA_SETLEAF.

Hierarchy table for LEAFs (SETLEAF)

A third table containing the relationships between SETLEAFs and SETNODEs is needed. In this table we have one entry for each KSTAR (SETLEAF) related to a SETNODE. 

If you are interested in the stored procedures that create the tables described above please click in the following link to a Word document which contains the code for the stored procedures in the order that should be executed: BOMStoredProcedures

Note: To execute the stored procedures I use the Execute SQL Taskin SSIS.

The resulting tables from this process are:

2.3. Create the Dimension table DimCostElement

With the previous 2 tables now we can load them into ONE single Dimension table, DimCostElement. First we will load the nodes from the Z_SETNODE_HIERARCHY table and later the leaves from theZ_CSKA_SETLEAF_SETNODE table.

By joining these two tables we have “flattened” the hierarchy. Now as leaves and nodes are in the same table, they can be considered the same object type.

Remember that this Dimension table contains ALL the possible paths in the hierarchy, so a child will appear multiple times in the table. Also remember that a child now can be not only a leaf node member but also normal node member. There is no more distinction between leaf and node members.

The structure of this table is very important, let’s have a deeper look at it:

DimCostElement table contains a CostElementID and a ParentID field. The CostElementID field acts as ID for the table, while the ParentID field just refers to the former to create a parent-child relationship.

The SETNAME field contains the SETNAME for the SETNODEs and the KSTAR for the LEAFs. In this way we have unified the SETNAME for the two different object types. Be aware here that one SETNAME can have multiple CostElementIDs due to that SETNAME will appear in more than one hierarchy path.

The concept of decoupling the name of an object from its database ID, allowing the object name to have more than one ID is what allows us to have multiple-parent parent-child relationships. This is the most important concept to learn in this exercise.

The last two fields are description fields.  

Section 3 – Relate the Fact table with the Dimension table

The fact table, COEP, has for each entry one KSTAR field that relates the described expense with one Cost Element in the dimension table; but because one Cost Element can be in more than one hierarchy the relationship between the fact table and the dimension table cannot be direct. We need an intermediate table.

The intermediate table pairs each KSTAR from the fact table with all possible CostElementIDs in the DimCostElement table. Let’s remember that in our case the KSTAR field from the fact table contains the same value (here is the relationship) as the SETNAME field from the DimCostElement table, but because of the multi-parent parent-child hierarchy nature of the Cost Elements,
one SETNAME can have multiple CostElementIDs. The intermediate table stores for each KSTAR from the fact table all the CostElementIDs from the DimCostElement table (many-to-many relationship).

You can create another stored procedure that fills the fact-less table with the result of a join query between the fact table and the dimension table.

In SQL Server 2008 this type of intermediate tables is known as Fact-less tables. You can search the Books Online for further information about them.

In this solution we cross one of the golden rules in data warehousing. KSTAR values in the Fact table should be replaced by their equivalent CostElementID from the DimCostElement table. But because one KSTAR value can have more than one CostElementID (remember that one KSTAR can be in more than one leaf member), this is a better solution that saves us of creating another intermediate table.

With this last table the database is ready for Analysis Services. In the next section I describe how to configure the cube to allow the BOM relationships and properly aggregate them.

Section 4 – Configuring the cube

4.1. Adding the fact and dimension tables

First step will be to add the 3 previous tables (fact, dimension and intermediate) to the data source view and create the appropriate relationships between tables (if you are as me that I don’t create any relationships in the database itself).

Create an empty cube and let’s add the tables manually to fully control how Analysis Services understand them.

The 3 tables will be added and configured in the cube as follows:

  • The Fact COEP table will be added to the cube a fact table with one measure group.
  • The intermediate Fact-less table will be added to the cube as both a dimension and a fact table.
  • The DimCostElement table will be added to the cube as a dimension table.

The Fact-less table should only be configured as a fact table, as it doesn’t contain any data, but SSAS needs 2 dimensions when configuring many-to-many relationships. The reason behind this is that SSAS approaches many-to-many relationships as 2 regular relationships put together.

You can add first the Fact-less table as a dimension and later as a fact table or on the opposite order, it really doesn’t matter. To not confuse the final user set the “Visible” attribute of the dimension to “False” and do the same for each measure of the fact. This will prevent the false dimension and fact-less table to appear in any client software that queries the cube.

4.2. Configuring the Fact-less dimension

Attributes:

The Fact-less dimension will have 3 attributes:

  • Factless-Cost Element ID: This attribute is the field CostElementID field from the Fact-less table
  • FactCOEPActual – KSTAR: This attribute is the KSTAR field from the Fact COEP table (main fact table)
  • DimCostElement – Cost Element ID: This attribute is the CostElementID field from the DimCostElement table

Note: I like to name the attributes in these special tables with the name of their source table preceding their name for clarity purposes.

These 3 attributes are needed to later build the many-to-many relationships between the tables, as it is explained in the next section of this document.

4.3. Configuring the Dimension Usage

The Dimension Usage tab allows us to define the relationship between the dimensions and the measure groups (fact) contained in a cube. It is here where we can tell Analysis Services that we have a multi-parent parent-child between our 3 tables.

The screenshot below shows the final scenario for the Cost Center (Factless OBJNR Cost Center) and Cost Elements (Factless KSTAR Cost Element). As this document only covers the Cost Element case, let’s focus our attention on the Actual Values and Factless KSTAR Cost Element columns/measure groups and Dim Cost Element (Cost Element) and Factless KSTAR Cost Element rows/dimensions.

In this document I will not describe the different types of dimension relationships. I encourage you to read this topic in the Microsoft MSDN here: http://msdn.microsoft.com/en-us/library/ms175669(v=SQL.105).aspx

First relationship

(Dimension) Factless KSTAR Cost Element with (Measure G)
Factless KSTAR Cost Element:

This is a Regular relationship as the dimension and the measure group are jointly directly by a key column.

Configure the relationship as in the following screenshot:

Second relationship

(Dimension) Dim Cost Element with (Measure G) Factless KSTAR
Cost Element:

This is a Regular relationship as the dimension and the measure group are jointly directly by a key column.

Configure the relationship as in the following screenshot:

Third relationship

(Dimension) Factless KSTAR Cost Element with (Measure G) Actual Values:

This is a Regular relationship as the dimension and the measure group are jointly directly by a key column.

Configure the relationship as in the following screenshot:

Fourth relationship

(Dimension) Dim Cost Element with (Measure G) Actual Values:

This is the most important relationship to define. The Cost Element Dimension is related in a many-to-many fashion to the fact table. Analysis Services needs of an intermediate table (fact) in order to build the relationship.

As our intermediate table has all possible paths for all the hierarchies and the dimension table has a parent-child relationship built-in, this combination will give us the bill-of-materials scenario.

Configure the relationship as in the following screenshot:

After all the configuration is done, build the cube (first build the dimensions and later the measure groups) and start enjoying a multi-parent parent-child relationship (or Bill-of-Materials) in SQL Server Analysis Services 2008.

Categorias:Parent-Child
  1. 2013/06/24 às 5:11

    Hey just wanted to give you a quick heads up. The
    words in your content seem to be running off the screen in
    Internet explorer. I’m not sure if this is a format issue or something to do with web browser compatibility but I thought I’d post to let you know.

    The design and style look great though! Hope you get the issue resolved soon.
    Thanks

  1. No trackbacks yet.

Deixe uma Resposta

Preencha os seus detalhes abaixo ou clique num ícone para iniciar sessão:

Logótipo da WordPress.com

Está a comentar usando a sua conta WordPress.com Terminar Sessão / Alterar )

Imagem do Twitter

Está a comentar usando a sua conta Twitter Terminar Sessão / Alterar )

Facebook photo

Está a comentar usando a sua conta Facebook Terminar Sessão / Alterar )

Google+ photo

Está a comentar usando a sua conta Google+ Terminar Sessão / Alterar )

Connecting to %s

%d bloggers like this: