Tallan's Technology Blog

Tallan's Top Technologists Share Their Thoughts on Today's Technology Challenges

Posts Tagged "Warehousing"

Introduction to Analysis Services

This post is intended to introduce Analysis Services 2005/2008 foundational concepts.
SQL Server Analysis Services is a component included in the Microsoft SQL Server product, and its use is fully covered under the same license (which applies to Integration Services and Reporting Services as well). Like the database engine, SSAS has a range of features stratified by Express, Workgroup, Standard Edition and Enterprise Editions.
SSAS is an OLAP, multi-dimensional database. While a relational data warehouse can also be described as OLAP, products such SSAS, Cognos and Essbase have fundamental differences. These OLAP servers:
· Aggregate data from a variety of sources into a compressed format optimized for query response
· Emphasize end-user navigation with modeling capabilities such as hierarchies, or drill-paths
· Employ MDX as the standard query language for retrieving result sets
· Use XMLA as the standard communication mechanism between clients and servers
· Automate…

Partitioned Fact Tables

Once tables grow into the millions of records, they become candidates for partitioning. Table partitioning offers many benefits, particularly in warehouse environments. Since data is split into smaller units of storage, backups can target filegroups with a higher rate of change. Systems with multiple CPUs see improved query performance as partitioned data leads to greater parallelism. Perhaps most significant is the ability to swap in huge amounts of data by partition switching, an operation that is practically instantaneous.
Tables can be partitioned horizontally or vertically. With vertical partitioning, columns are split out into separate physical tables. This post focuses on horizontally partitioned tables, which take advantage of new constructs Microsoft added in the 2005 release – partition functions and partition schemes. Table partitioning is an Enterprise Edition only feature in SQL Server 2005 and 2008.

Developing a Dimensional Model w/ conformed dimensions and facts


Conformed Dimensions and Facts
Data marts are often developed to represent important systems within a company.  Over time additional data marts are added, and eventually there is a desire extract data across multiple marts.  Extracting data across data marts can be cumbersome and some times impossible if the data marts were not designed to share common dimensions, also known as conformed dimensions. Ideally since the dimension criteria would be the same for each data mart, selecting data across multiple marts would be as routine as selecting data from a single data mart.
Going hand and hand with conformed dimensions, conformed facts involve standardizing facts across multiple marts.  Adopting conformed facts eliminates the ambiguity of having facts that possess the same name but have different underlying calculations.
In this blog we use simple data marts to demonstrate the use of both conformed dimensions and…

SCD via SQL Stored Procedure


We are going to revisit the issue of dealing with Slowly Changing Dimensions in a data warehouse. We have seen a demonstration of using the SCD transformation that is available in SQL Server Integration Services (SSIS); however, this is not always the best option for processing an SCD. When the volume of rows you’re dealing with is substantial, this creates a significant, and usually unacceptable, performance hit. Another valid reason why you may choose not to implement the SSIS transformation is flexibility. The SSIS transformation requires your begin and end dates to be datetime data types, which could prove to be inconvenient when your SCD needs to tie back to a date and/or time dimension.
One alternative we are going to exhibit is using a SQL Server stored procedure. This example demonstrates the implementation of a Type 2 SCD, preserving the…

The SCD Transformation in SSIS


In this post, we will demonstrate the Slowly Changing Dimension (SCD) transformation that is available for use in a Data Flow task within SQL Server Integration Services (SSIS).  We’ll walk through an example of how to use it, explaining the terms as we go. In addition, I’ll demonstrate some “extra effort” kind of modifications that can be made for a more thorough usage.  After that, we’ll discuss the pros and cons of using it at all.  And we’ll wrap up with some tips on how to tune the transformation for better performance.

A Warehousing Overview

Data warehousing is a big subject. This overview is intended to cover some of the most representative issues on a high level: the nature of OLAP systems, star schemas, facts and dimensions, and differing perspectives (Inmon vs. Kimball) on warehouse design.
OLTP systems are the operational databases supporting applications. They are highly normalized, and focused on CRUD operations.
OLAP databases are usually arranged in star schemas and are built for speed in retrieving aggregated data.