Tallan's Technology Blog

Tallan's Top Technologists Share Their Thoughts on Today's Technology Challenges

SQL Server Analysis Services Multidimensional vs Tabular

Mark Frawley

It is no secret that Microsoft is moving away from the original “Multidimensional” (MD) flavor of Analysis Services, instead focusing all efforts on the new “Tabular” (Tab) flavor.  This drives the standard advice that Tab should be the basis of most new cube projects.  Another idea driving this is that for 60-70% of projects either technology likely will work fine – so it is prudent to go with the newer one which is much earlier in its lifecycle.

However, there are many differences between the two technologies.  In particular cases, one or the other is clearly the best fit, but without understanding these differences it is difficult to even assess whether your project is in the 60-70% where it doesn’t matter, or the 30-40% where it may.  Even if you would elect Tabular regardless to stay earlier in the product lifecycle, you should be sure you understand the tradeoffs.

Since MD is a much more mature product than Tab it is no surprise it has many more features and a better user and developer experience.  We hope that as Tab evolves it will eventually emulate at least some of these, and perhaps introduce new features unknown in MD.

Much of the information following can be found scattered on the internet, but this summary brings the most critical information related to deciding between MD and Tab together in one place.  What Tab lacks in functionality it makes up for in raw query performance, so it is the clear choice when that is more important than the missing features.  However, DAX and Tab are not as simple as they appear.

Unfortunately it is difficult, if not impossible, to tell a priori whether MD can provide the required performance, along with its rich functionality, or whether Tab has the better shot at that.  Typical POCs may not help because production-scale data may be needed to fully appreciate their relative performance.

The reader is assumed familiar enough with both MD and Tab that definition of terms is not needed – and that (s)he is able to assess the significance of each feature difference relative to the requirements that need to be met.

Feature/Capability Multidimensional (MD) Tabular (Tab)
Actions Supported Not supported (except via BIDSHelper with pre-2016 Tabular).
Aggregations MD supports both heuristic and usage-based creation of “aggregations” – summations at various grains created at processing time which are automatically used if possible at query time. Creating useful aggregations can be difficult, and their creation adds to processing time – but they provide another query performance-tuning option. Aggregations can be defined at a per-partition level which adds great flexibility in deciding what tradeoffs to make. Tab has no in-built notion of aggregations. In particular cases summary tables could be created in the source and added as additional tables to the Tab model, or created directly in the model as calculated tables – however, their use would either require new Measures based on them, or DAX code in existing Measures that determines when they can be used.
Attribute Modeling Support Attributes are a fundamental dimension abstraction in MD and can be modeled with numerous properties. Among the most important are:
1 – an attribute’s key and name are distinct properties and may be the same column or distinct columns from the source dimension table
2 – an attribute’s key can be simple (one column) or compound (multiple columns). This capability enables modeling “naturally unnatural” hierarchies as natural.
3 – in addition to key and name, an attribute can have a “value” property coming from yet another distinct column.
Tab does not have the abstraction of attributes – the analogue is a column in a table. The following important observations flow from this:
1 – an attribute’s key and name are the same column. “The key is the data”.
2 – by #1, compound keys are not supported. This lack means “naturally unnatural” hierarchies stay that way in Tab, with query performance impact.
3 – no separate “value” property is supported.
Attribute Property Support MD attributes have intrinsic properties common to all (e.g. Key or Name), and may have user-defined “properties” which are defined in terms of other attributes and relationships to them. Both can be accessed in MDX expressions and queries. No equivalent concept.
Calculated Members MD supports calculated members, which are calculations expressed in terms of attribute members rather than other measures – for example (and only example, one would never do this in practice), in a Time dimension, one could define a Quarter1 calculated member as Month.January + Month.February + Month.March. This calculation would then apply to any measure used. No equivalent concept.  In particular cases may be able to simulate by extending the model with a custom-built M:M structure.
Calculation Model A conceptual N-dimensional space where N = number of attributes from all dimensions. Any point in this space has coordinates (aka context) defined by a value from every attribute, and the total collection of points is the Cartesian product of all attributes’ member values, including the All member. MDX can address any of these points in a consistent syntax, with most coordinates implicit. Calculations occur in measures, which are MDX expressions, and via SCOPE statements in the cube calculation script. Measures can also define points outside the space. Tab does not present a conceptual “space”, rather calculations are controlled by complex interactions between DAX expressions, defined relationships and foundational concepts including “filter context”, “row context” and “context transition”. It shares with MD the idea of a “calculation context”, but little else.Knowledge of the mental model required to exploit MDX is actually a hindrance to “thinking DAX” – and vice versa.
Conditional Formatting of Measures Supported. Formatting can even be set by an MDX expression and thus be arbitrarily dynamic. Of course, any formatting is only metadata passed to the client, which may or may not implement it. Excel does. Not supported OOTB. May be achievable in limited contexts via hacks (such as having multiple versions of a measure with different formatting).
Critical Infrastructure for Performance 1 – fastest possible disks
2 – fastest possible I/O throughput (taking into account SAN and VM, if used)
3 – fastest possible CPU and multiple threads
4 – memory amount and speed is secondary, though must meet minimums that may need to be found empirically
1 – fastest possible memory and sufficient for database size (3-4 times)
2 – fastest possible memory bus speed and maximal CPU cache, taking into account CPU family
3 – fastest possible CPU and multiple threads
4 – disk speed and I/O throughput is secondary – as bottlenecks these will affect processing, initial load of cube into memory at server startup, and synch operations involving major changes..
Default Member Support All attributes have a default member property, which can be any member in the attribute or an MDX expression generating one – either of which can be extremely useful in some scenarios. Generally most attributes have the “All” member as their default member. Only the “All” member is implicitly supported as a default member. Any other value requires writing DAX code in a Measure or query, keeping in mind that Tab does not have attributes in the MD sense, only columns.
Developer Experience 1 – UI very developed in SSDT
2 – must have access to a MD server instance (can be local) to deploy to for processing and testing.
3 – each major object (cube, dimension) is a distinct file in the SSDT project, allowing some degree of concurrent development. Object metadata is in XML and includes noise related to the developer’s screen structure – merging changes to the same object is essentially impractical.
1 – UI relatively crude in SSDT, especially prior to the 2016 September release, which introduced much better cube navigation. Maintaining the model diagram in a legible form can be difficult. The Measures grid does not facilitate best-practice formatting of DAX expressions.
2 – prior to the noted release of SSDT, must have access to a Tab server instance (can be local) to host the “workspace” database.. As of the noted release this is still supported, but in addition SSDT can also spin up its own internal Tab instance, analogously to Visual Studio’s built-in web server, allowing stand-alone development.
3 – a Tab cube is a single “bim” file in the SSDT project. Concurrent checkout by multiple developers followed by merging may be more feasible than with MD, as the metadata is JSON and lacks noise present in MD’s XML metadata, but as a practical matter this is not likely to be viable.
Drillthrough to detail Supported. Only base (fact) measures may be accessed. Output is ugly, which can be addressed via Reporting Services or ASSP. Supported similarly to MD. One exception is that only an empty SELECT in a DRILLTHROUGH statement is supported.
Entitlement/Security MD supports a very granular and functional, role-based entitlement scheme for accessing data, down to individual “points” as described in Calculation Model, and with the option of Visual Totals. Entitlements can be expressed as both “deny” and “allow”, and can be driven by MDX expressions. The current username and connection string custom data are accessible. MD also supports a coarse-grained permission scheme for changing and processing the cube.? While the query performance degradation data security imposes may limit the maximum practical cube size to much less than it otherwise would be, it may still be a deciding factor in particular cases. Tab allows defining role-based filters on any table in a model, which restrict which rows can be accessed and can secure (hide) entire tables. In effect only a “deny” model applies. The current username and connection string custom data are accessible. The effects of such filters may be subtle and hard to predict, since the effect of such filtering will propagate through relationships and implicitly filter other tables. Visual Totals are in-effect the only type of total supported. Like MD, Tab also supports a coarse-grained permission scheme for changing and processing the cube.
Excel Support Excel and many other cube browsers (but not PowerBI) “speaks MDX” and queries cube metadata via XMLA. While Excel in particular is notorious for its sub-optimal MDX, at least it is speaking the native language to MD. Tab supports the same XMLA protocol as MD, the only difference being that it accepts DAX queries in addition to MDX queries. In the latter case the MDX is translated directly into low-level equivalent operations. This allows MDX-speaking browsers to present the same user experience of cubes, dimensions and measures as when working with MD, even though most of these constructs are foreign to Tab.The combination of sub-optimal MDX typically generated by Excel and other browsers, and MDX not even being the native language for Tab, means that in such scenarios Tab may be unable to reach its full performance potential that would be possible with an equivalent, well-written and native DAX query.
Hierarchy Support – General Strongly supported Weakly supported
Hierarchy Support – Natural vs Unnatural Rich capabilities for defining attribute key, enabling MD to model unnatural hierarchies (UH) as natural hierarchies (NH).As used here, “unnatural” means the same member key occurs at a given hierarchy level more than once – e.g. “January” in a Date dimension. Converting to natural requires either taking parentage into account via a compound key – or changing the data, e.g. “January 2010″.See also Attribute Modeling Support. Since the only option in Tab is that key = name = one column, a hierarchy may unavoidably be unnatural if the data cannot be changed to compensate (e.g. January -> January 2010). It so happens that UHs in Tab can have profound performance issues against Excel due to the MDX Excel generates. Supposedly this is fixed if the 2016 version of both Excel and Tabular are used together.See also Attribute Modeling Support.
Hierarchy Support – Navigation First class set of MDX functions and syntax specifically designed for hierarchy navigation. Very weak DAX support for hierarchy navigation leading to complex DAX code required in Measures or queries.
Hierarchy Support – Parent/Child Supported Not supported directly. You must flatten every level from the P/C hierarchy as a new calculated column populated with non-trivial DAX expressions, and setting of HideMemberIf property (pre 2016 release – not possible thereafter). Open-ended maximum depth not possible.
Hierarchy Support – Ragged Supported via HideMemberIf property Not supported (except via BIDSHelper with pre-2016 Tabular).
KPI Support MD supports server-defined KPIs, wherein each KPI parameter can be defined as an MDX expression, giving them the potential of great dynamism. KPIs are independent objects. Tab also supports KPIs. They are associated with measures. Only the base and target values can be dynamic (i.e. be measures). Unlike in MD, the breakpoints between different statuses cannot be dynamic – they are set manually in the KPI definition.
Language Extension Allows integrating .Net DLLs which leverage ADOMD.NET (also allows COM modules but not recommended). Exposed methods may be called from MDX queries, thereby extending it. The Codeplex project ASSP (Analysis Services Stored Procedures) illustrates this and has many useful methods. No DAX extension mechanism.
M:M Support MD supports M:M relationships between a dimensions and fact tables (measure groups), by representing the M:M mapping as a distinct measure group and then relating it to both another measure group and a dimension. The existence of the M:M relationship is transparent to measure definitions. The configuration is simple and is represented in SSDT in a clear way. Tab can support M:M relationships. Prior to the 2016 release (including SSDT) this had to be done by writing DAX code in each measure affected by the M:M, and was a complex workaround. As of the 2016 release, M:M is supported more natively via configuration of “bidirectional filtering” on a relationship. It is no longer necessary to write DAX, at least with server-based databases.
Maximum Database Size MD is primarily disk-based; in principle the maximum database size is constrained only by available disk space. On average the MD compression ratio (raw source data to cube files) is 3:1. As a practical matter, the larger the cube the simpler it must be. An MD cube larger than 1 Tb (thus 3 Tb source data) would be unusual, though the record is 24 Tb. Tab is 100% memory-based when querying – disk is only used during processing and to store the memory image when the server is stopped. As such, in principle the maximum database size is constrained by the amount of memory supported by hardware and budget. On average the Tab compression ratio (raw source data to cube files) is 10:1.In practice, if the same Tab server is used for processing, querying and synching, the required memory must be at least 3-4 times the largest database to allow for transient additional copies of it. Each distinct database simultaneously hosted on the same server also requires an increment of memory at least equal to its size.
Measure Aggregations Common aggregations (SUM, MIN, MAX, etc.) are supported as Measure properties and “just work”. Custom aggregations based on a dimension member (e.g. “+”, ‘-“, “~”) are also supported. Any form of aggregation requires writing DAX code in either extended columns, Measures or queries. When implemented as extended columns they are not query-context-sensitive.
Partitioning Support 1 – partition design can be used to tune both processing and querying performance. It may help both, or there may be a tradeoff based on many factors.
2 – only fact tables (measure groups) may be partitioned – dimension tables may not be.
3 – with proper configuration and CPU resources (threads), partitions can be processed and queried in parallel.
1 – partition design can be used to tune processing performance only. It has no impact on query performance.
2 – there is no distinction between dimension and fact tables – any table can be partitioned. This is a distinct advantage of Tabular.
3 – with proper configuration and CPU resources (threads), partitions can be processed in parallel.
Sensitivity to Structural Change In MD, most structural changes will require extensive reprocessing, if not a full process of cubes and dimensions. Any structural change to a dimension which requires it to have a full process (which is most, other than minor metadata changes like making a visible attribute hidden) will invalidate all cubes that use the dimension and require them to be fully processed as well. This is truly MD’s Achilles heel. In Tab, a structural change to a table generally only requires a reprocess of the table, not any other (e.g. fact) tables – and depending on the change, may not require a full process, either – adding/changing/removing extended columns, measures and hierarchies generally only requires a relatively speedy Process/’Recalc’. In this respect Tab is superior to MD in many scenarios.
Translations Supported Not supported
Writeback Support MD supports writeback, which is important in budgeting/forecasting applications. Not supported

 

No comments

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>