Tallan Blog

Tallan’s Experts Share Their Knowledge on Technology, Trends and Solutions to Business Challenges

Category Archive for "Data & Analytics"

Introduction to Analysis Services

This post is intended to introduce Analysis Services 2005/2008 foundational concepts.
SQL Server Analysis Services is a component included in the Microsoft SQL Server product, and its use is fully covered under the same license (which applies to Integration Services and Reporting Services as well). Like the database engine, SSAS has a range of features stratified by Express, Workgroup, Standard Edition and Enterprise Editions.
SSAS is an OLAP, multi-dimensional database. While a relational data warehouse can also be described as OLAP, products such SSAS, Cognos and Essbase have fundamental differences. These OLAP servers:
· Aggregate data from a variety of sources into a compressed format optimized for query response
· Emphasize end-user navigation with modeling capabilities such as hierarchies, or drill-paths
· Employ MDX as the standard query language for retrieving result sets
· Use XMLA as the standard communication mechanism between clients and servers
· Automate…

SSRS 2008 Installation and Configuration

Reporting Services Installation
Reporting Services 2008 is included as part of the installation for SQL Server 2008. To install Reporting Services, select the checkbox that says “Reporting Services” under the “Instance Features” section.

As part of the installation, a Service Account will be configured for the Report Server.
The Report Server runs under a single Windows Service using either a Built-in account or Domain Windows user account.
The report server service contains:

Report Server Web Service
Report Manager
Background processing application for scheduled report processing and subscription delivery

MS BI News Roundup

Two big pieces of news on the Microsoft BI front:

Performance Point Server is being dismantled. After the next service pack, MS will cease develop of the product as its own entity. The Monitoring & Analytics capabilities will be bundled into the enterprise version of SharePoint. The Planning application looks to be dead in the water, however.Although PPS was complicated and unwieldly, it had potential and an interesting vision. I don’t see the failure of any part of the MS BI stack as a positive thing. The official press release and blog reactions are here.
While Microsoft had previously described a major-minor release schedule as the model for SQL Server going forward, the next version sounds like an interesting departure from that. Code-named “Kilimanjaro”, the next SQL Server release has been described as minor and BI-focused.Two components described thus far are “Madison”…

Partitioned Fact Tables

Once tables grow into the millions of records, they become candidates for partitioning. Table partitioning offers many benefits, particularly in warehouse environments. Since data is split into smaller units of storage, backups can target filegroups with a higher rate of change. Systems with multiple CPUs see improved query performance as partitioned data leads to greater parallelism. Perhaps most significant is the ability to swap in huge amounts of data by partition switching, an operation that is practically instantaneous.
Tables can be partitioned horizontally or vertically. With vertical partitioning, columns are split out into separate physical tables. This post focuses on horizontally partitioned tables, which take advantage of new constructs Microsoft added in the 2005 release – partition functions and partition schemes. Table partitioning is an Enterprise Edition only feature in SQL Server 2005 and 2008.

Introduction to SAP Business Objects Data Integrator ETL tool

SAP Business  Objects Data Integrator XI R2/R3
The purpose of this presentation is to familiarize the team with some of the features of Business Objects Data Integrator (DI). The presentation is also aimed at providing some insight into how technology can affect the design and implementation of  the ETL processes’.
While at a high level it is best that an ETL architecture be technology agnostic, the physical implementation can stand to benefit by being designed to take advantage of the features provided by the technology.

SSIS data management fundamentals

Loading data into a warehouse for use in business intelligence often involves loading data from multiple dissimilar sources. Many times the easiest way to consistently manage the data involves doing extracts from the source systems into a simple text format such as comma delimited. Even legacy systems have some form of reporting engines that allow for text file output, and in dealing with more current database types there are mechanisms such as BCP to allow for dumping output to a text file. A few more reason for using flat files – it allows for a snapshot in time that is replicable, it allows for a single output from a source system to easily be imported into multiple different warehouses, and allows for the imported data to easily be broken into manageable chunks by managing the text files.
SSIS has many features…

ETL Process Control Architecture

Data Volumes are growing at incredible rates across many IT departments. Businesses are dealing with new problems every day, and there is much demand on IT to be flexible and to react to many issues facing the business on a daily basis. Many times, processes become obsolete or disfunct because they cannot keep up with the rate of change and increasing demands by the business. In some cases, the users lose confidence in the systems and seek alternative methods for delivering reports and data in a timely fashion.
A centralized, standard Process Control Environment can provide an inexpensive and efficient way to provide an auditing mechanism on new and existing data migration processes which may run occasionally or very frequently.  This will enable an IT department to become more pro-active and aware of issues well before the end users are aware….

Developing a Dimensional Model w/ conformed dimensions and facts

Conformed Dimensions and Facts
Data marts are often developed to represent important systems within a company.  Over time additional data marts are added, and eventually there is a desire extract data across multiple marts.  Extracting data across data marts can be cumbersome and some times impossible if the data marts were not designed to share common dimensions, also known as conformed dimensions. Ideally since the dimension criteria would be the same for each data mart, selecting data across multiple marts would be as routine as selecting data from a single data mart.
Going hand and hand with conformed dimensions, conformed facts involve standardizing facts across multiple marts.  Adopting conformed facts eliminates the ambiguity of having facts that possess the same name but have different underlying calculations.
In this blog we use simple data marts to demonstrate the use of both conformed dimensions and…

SCD via SQL Stored Procedure

We are going to revisit the issue of dealing with Slowly Changing Dimensions in a data warehouse. We have seen a demonstration of using the SCD transformation that is available in SQL Server Integration Services (SSIS); however, this is not always the best option for processing an SCD. When the volume of rows you’re dealing with is substantial, this creates a significant, and usually unacceptable, performance hit. Another valid reason why you may choose not to implement the SSIS transformation is flexibility. The SSIS transformation requires your begin and end dates to be datetime data types, which could prove to be inconvenient when your SCD needs to tie back to a date and/or time dimension.
One alternative we are going to exhibit is using a SQL Server stored procedure. This example demonstrates the implementation of a Type 2 SCD, preserving the…

The BCP Option

Moving data from one repository to the next may sound like a trivial task, and in the sense of the normal mechanical approach it very well may be.  What could be simpler than INSERT INTO Here SELECT * FROM There?  Not much, it would seem, until your data source is something other than a SQL Server table or your data volume is so heavy that such a simple statement chokes, or worse yet, affects the transactional system’s performance.  What can you do when your data volumes are large enough that you need to consider faster approaches than manual SQL Statements?
Fortuitously, Microsoft has provided us with the BCP utility.  BCP is a command line executable that is optimized to move data very quickly from one place to another.  BCP is flexible enough to allow transformations of data in the operation, as…