Analysis Services Tabular: Many-To-Many Relationships, Bridge Tables, and Blank Members
With bidirectional filtering in SQL Server Analysis Services (SSAS) Tabular 2016, it’s easier than ever to build many-to-many relationships into your model. But what are some ways to avoid trouble when building them? This post covers two topics: (1) a scenario that can cause Tabular to match completely unrelated groups across a many-to-many relationship, and (2) some strategies for automating your bridge tables (including where they might need some brains!). If you’re already comfortable with this topic, and just want to see some DAX, feel free to skip right to the calculated bridge tables.
Otherwise, let’s say you’ve built up your many-to-many relationship based on a bridge or relationship table someone generated in the database, and have measures based on one or both of the tables connected by the bridge. When looking at the results, you see that a number of the results appear to be duplicates, and some of the duplicated rows shouldn’t even be there! Having been in this situation myself, I took apart my model and found a surprising result: the unmatched members on both sides of my bridge were actually matching each other as a new, and blank, group.
How does this happen in Tabular? In Multidimensional there is a concept called the “Unknown Member”, which you can set some of the display properties for. However, as with some of the other controls from Multidimensional, this was specifically left out of Tabular as low priority.
This Unknown Member is generated whenever a relationship does not find a match for one end, and in Tabular is represented by a new BLANK() member. Since a column with BLANK() values cannot fill the “1” role in a 1-to-Many relationship, this creates a unique situation where your bridge table effectively has an imaginary BLANK() row by default, which matches everything from another table that does not otherwise have a match, including actual BLANK values. This can get complicated when one or more sides of the relationship have BLANK or unmatched values, as this means that the virtual BLANK in the bridge is legitimately joining multiple items.
Consider the following tables:
Figure 1. A small Tabular model diagram indicating our example table relationships. The relationships are bi-directionally filtered to support the Count measures.
|Item A||Value||Item B||Value|
Tables 1-3. The three Item tables, and their contents for this example. Both A and B include a “BLANK” item.
If we were to generate Pivot Tables in Excel from this model, we might see the following:
|Item A||Item B||Item A Count||Item B Count|
|F||1||1||Item A||Item A Count||Item B Count|
Tables 4-5. Example Pivot Table outputs from using Item A, Item B, and the two Item x Count measures.
There are three important cases within these tables:
- Matched by the A-Bridge relationship AND the B-Bridge relationship (B, C)
- Matched by just one of the two relationships through the bridge (A, E)
- Matched by neither relationship (BLANK, D, F)
The last case is represented by the cross-join of the items that appeared as blanks, which is why both Item A.D and Item A.(BLANK) correspond to Item B.(BLANK) and Item B.F independently in the first table. This is made even clearer by dragging the Item column from the Item Bridge table onto our Pivot Table:
|Item A||Item B||Item||Item A Count||Item B Count|
|F||1||1||Item||Item A Count||Item B Count|
Tables 6-7. The example Pivot Table from before, sliced by the Item Bridge’s Item column. Note that same blank value generated by Excel to mark the unmatched cell is the same blank member causing issues when summing across the relationships.
So as you’ve probably already realized, the issue is not that items did not have matches, but that both sides, when unmatched or blank, match to a virtual BLANK member, and this can establish what SSAS considers a valid many-to-many relationship.
Avoiding the Unknown Member
Of course, with the power of DAX and the ability to filter parts of the model easily, you can modify your relevant measures to filter out the generated BLANK row. In fact, there’s a filter function designed explicitly for this: ALLNOBLANKROW(), in addition to straightforward filters like placing a NOT(ISBLANK(…)) condition on the bridge relationship columns.
While solutions have been made for emulating a customized Unknown Member, the real key is in keeping track of the relationships you’re implementing. Then you can make sure that you only allow unmatched members when you’d like to, and only for the side of the relationship that is appropriate. Since the Unknown Member “issue” can be hard to notice, let along debug, and since adding blanket filters is usually a last-resort, this pro-active approach can cut down dramatically on both debugging and development time.
Calculated Bridge Tables
If you don’t have reliable source for your bridge tables already, or if you want to make sure that they stay up-to-date automatically, consider using the following DAX patterns to construct a calculated table as a many-to-many relationship bridge.
Throughout these examples, especially if you’re working with a similar test data set, the engine may not correctly calculate the cardinality. In this case, if you try to use the drag-and-drop approach to building the table relationships, you may run into issues. If you start to see issues about circular dependencies, try to verify that you don’t have any looping filter paths, and create the relationship manually by going through the Create A Relationship dialog. If you receive errors about BLANK values not being allowed, try to add in a BLANK filter to your table DAX, or remove these extra BLANK values from your dataset.
If you already have a data source that contains all of the values you expect to see, or at least all of the values you’re interested in, consider using the following pattern:
Single Bridge:=/*VALUES('Item A'[Item A])*/ // You can use this pattern when you don't need to apply any filters, or if you want the process to fail if a BLANK would be added CALCULATETABLE(VALUES('Item A'[Item A]),NOT(ISBLANK('Item A'[Item A]))) // Or you can filter the result to remove BLANKs
This works whenever you need to filter one table primarily, such as with a fact connecting to a dimension, or if you want to make sure the BLANK row is used when the relationship is missing on one side. However, if you need to only show true matches, or if you need to account for more tables, you might want to consider another option.
|Item A||Item B||Item A Count||Item B Count|
Table 8. Pivot table output from a “single” bridge table. The BLANK member from A becomes a distinct group from items in A that have no coordinating item in B.
Calculating a bridge table to contain all of the values in the tables you’re trying to join is a good way to eliminate mismatches without needing to write a new view or process to generate the data in your database.
Union Bridge:=DISTINCT(ALLNOBLANKROW(UNION(VALUES('Item A'[Item A]),VALUES('Item B'[Item B])))) // The distinct set of values from both tables, based on the column name from the first table, and with blanks removed.
In this case, it’s also extensible to multiple tables if your “bridge” is starting to look more like a “hub”. However, it’s important to include the ALLNOBLANKROW() step, as this eliminates BLANK values from the union, which would prevent the table from being used as the “1” in any 1-to-Many relationships.
|Item A||Item B||Item A Count||Item B Count||Item B||Item A||Item A Count||Item B Count|
Tables 9-10. Pivot Table output from a “union” bridge table. The displayed column hierarchy determines where the cut-off occurs, but only full matches are related, as expected.
Sometimes, though, you need to capture more than just the set of combined options. For instance, let’s say that Item A and Item B are actually bridged by value. If you needed to calculate such a table, you might need a pattern such as the one below, which precalculates the join condition(matching by the Value columns on the original Item A and Item B tables), then filters itself to display the appropriate and non-BLANK combinations:
Filtered Bridge:= SELECTCOLUMNS( FILTER( ADDCOLUMNS( GENERATEALL('Item A','Item B'), "Difference", 'Item A'[Value] - 'Item B'[Value] ), [Difference] = 0 && NOT(ISBLANK('Item A'[Item A])) && NOT(ISBLANK('Item B'[Item B])) ), "Item A",[Item A], "Item B",[Item B] )
This pattern can be especially useful if you need to create a series of complex bridges, and there is some secondary filter that needs to be applied. Since you’re calculating it, you can automatically adjust it to capture all of the data you’ve loaded, or to filter it to match conditions on these other tables that may change later on, such as maximum ranges.
|Item A||Item B||Value||Item A Count||Item B Count|
Table 11. Pivot Table output from a filtered bridge matching by Value. Since each item had a unique value, each side is matched completely, including the BLANKs.
Filtered Bridge Use Case
Just to show an example, let’s say that you’re building a historical database, and need to align the age of different historical figures with other events that occurred in that year. Rather than generating this information in your source database and adding another required table refresh, you decide to generate it automatically as a calculated table.
In this case, if events have a marked year, and each historical figure has a birth and death year, a calculated bridge could generate the set of all years that a figure was alive, and use that to relate figures, events, and relative age. See the example DAX table statement below:
Historical Figure Age:= SELECTCOLUMNS( FILTER( ADDCOLUMNS( GENERATEALL(Figures,Years), // Using a historical figures table, and a set of years, generate all combinations "Age", Years[Year] - Figures[Birth Year] // and assume that age is simply a difference in year, for convenience. ), [Age] <= Figures[Death Year] - Figures[Birth Year] // This relationship is only valid while the person is alive! && [Age] >= 0 ), "Figure Name", Figures[Name], // And now you can connect to figures "Year", Years[Year] // and years "Age", [Age] // and augment this with our calculated age )
Given the sample model below, and some quick measures, it’d be simple to generate the following report:
|Age At Event||Event|
|Figure Name||WW1 Begins||WW1 Ends||WW2 Begins||WW2 Ends|
|Queen Elizabeth II||13||19|
Given how Tabular handles blank and unmatched members, be on the lookout for situations where you may be missing members on either side of a many-to-many bridge.
Some ways of getting around this issue, and to prevent needing to store bridge tables in your source data, are to generate them as calculated tables. This lets you ensure you have constantly up-to-date bridges, even if you need to generate a bridge with built-in business or reporting logic.
Are there any situations where you’ve run into the extra BLANK matching group, or where you’d like to generate your bridge tables automatically? Please feel free to tell us below!