SQL Server 2016 Stretch Database Using Filter_Predicate
If you’ve ever been faced with the need to archive SQL Server data in order to save database space and/or improve query performance, you found that this would not be a simple task. Especially when the archived data must still be available to existing applications….
Previously. this could not be done without either database structure changes or application code changes or both. Potential solutions could involve partitioning the database table, which could help with performance, or changing the application queries to access different databases/tables in order to get current vs. archived data. Both potentially time consuming and intrusive remedies.
The Stretch Database functionality introduced in SQL Server 2016 uses Microsoft Cloud Services to make data archiving that is seamless to your applications possible. I have worked with Stretch DB in a lab environment and found that it has potential to solve some challenging data archiving problems.
Stretch Database is a new SQL Server feature that uses a hybrid of on-premises and Microsoft Azure databases to create a data archiving solution without of the above mentioned architecture changes. Using a simple configuration process, Stretch DB allows you to customize a process to move older SQL Server data to an Azure SQL database, freeing up local database space and speeding queries for current data. With Stretch DB, the location of the data is transparent to a query, making archived data and current data available to existing applications without any changes.
There are some limitations to Stretch Database, however. Here is a short list.
Stretch DB cannot migrate from a table that has another table referencing it with a foreign key. That is, a parent table in a parent-child relationship cannot be migrated, even if its children rows are already migrated.
Primary key and unique key constraints are not enforced in the Azure table. They are still enforced in the source table, however. But if a row exists in the Azure database, another row can be inserted into the source table with the same unique key value. And the duplicated row could be migrated to Azure despite its duplicate unique value.
Some datatypes (timestamp, XML) and column types (computed) are not allowed in a Stretch DB table.
There are other limitations to using Stretch Database. See the “Limitations …” link in the Resources section for a complete list.
There is a wealth of information on the web showing how to set up Stretch Database. So, I will not go into that in this blog, but will instead concentrate on one feature of Stretch Database. The Predicate feature allows filtering of the data that will be moved to the archive database. With filtering, you can configure which rows should be migrated based on some business logic. Without filtering, all of the rows in the source database will be moved to the Azure archive.
During the setup you will be presented with the following panel:
In order to continue, you must select the table or tables you want to stretch and then click on “Entire Table” in the migrate column for each table selected.
The default behavior is to migrate the entire table. If the “Choose Rows” radio is not checked, all rows will be moved to Azure.
To filter the rows to migrate, check “Choose Rows” and enter a name.
Select a column to filter from the “Where” drop down.
Then select an operation (equal, greater than, etc.) from the drop down and a value to check against.
Here, I picked a date column, selected the less than or equal to operation, and typed a date value.
The “Check” button does a syntax check of the entered settings.
It didn’t like the single quotes around the date here. Removing the quotes allowed the check to succeed.
Note that SQL Server creates a Table-valued function in your source database from these configured values. So, if you have naming conventions that you adhere to, you may want to pick a name that conforms to those conventions.
This is the function code generated by the above configuration:
The parameter @TransactionDate is derived from the column selected in the “Where” drop down.
And SCHEMABINDING is required to prevent columns referenced in the function from being altered.
Stretch DB filtering utilizes a CROSS APPLY between the source table and the function to select eligible rows to be migrated.
Rows returning from the query with an is_eligible value of “1” will be moved to Azure.
Once these setup steps are completed, the table data begins to migrate to Azure.
You can also add a Stretch DB filter to a table using T-SQL.
Create a function with the desired filter logic in your source database similar to the example above.
Then execute an ALTER statement against the source table setting the function in the “FILTER_PREDICATE” clause.
Map a table column for the function’s parameter. The migration state tells Stretch DB the direction of the migration.
Changing the filter value of an existing predicate turned out to be a bit cumbersome. I assumed that changing the code in the function would be all that was necessary to change the migration filter. It turns out that it was not that simple.
An attempt to ALTER or DROP an existing Stretch DB filter function results in the following errors:
Msg 3729, Level 16, State 1, Line 8
Cannot DROP FUNCTION ‘dbo.Trans_Date’ because it is being referenced by object ‘Stretch_Table’.
Msg 3729, Level 16, State 3, Procedure Trans_Date, Line 1 [Batch Start Line 17]
Cannot ALTER ‘dbo.Trans_Date’ because it is being referenced by object ‘Stretch_Table’.
This was because of the Schema Binding specified in the function. The function is referenced by Stretch_Table.
To by-pass these errors, I paused the migration and removed the filter function from the table. The following query ran successfully.
This paused the migration and removed the filter.
I then changed the logic in the function, changing the hard coded date value and ran the following:
This resulted in this error:
Msg 14840, Level 16, State 1, Line 31
The filter predicate cannot be set for table ‘dbo.Stretch_Table’ because all rows are already eligible for migration.
What happened here is that when the Stretch DB filter was removed from the table, all the rows became eligible to be migrated. All the source table rows were either in the Azure database or eligible to be migrated there. This prevented the application of any stretch filtering.
To recover from this, I had to first bring all the data back into the source database from Azure. This was done with the following query:
With all of the data residing locally, the predicate filter was re-applied with:
The migration to Azure then began again.
This process could be very expensive for a stretched table with many rows. A little research uncovered a better method to change the Stretch DB filter.
Create a new function with the new filter criteria (date in this case).
Then include it in the ALTER table query.
This replaced the filter function without interruption to the migration process. The old function can be deleted at this point. And, an automated process could be built around the steps needed to maintain a varying filter requirement such as date windowing.
The ALTER will check conditions where the function logic is valid. For instance, the new function predicate must be less restrictive than any existing predicate. That is because it cannot allow data that has already been migrated to Azure to become not eligible for migration under the new filter logic. Any invalid conditions encountered will result in an error similar to:
Msg 14841, Level 16, State 5, Line 232
The filter predicate ‘dbo.Trans_Date’ for table ‘dbo.Stretch_Table’ cannot be replaced with ‘dbo.Trans_Date2′ because conditions necessary to perform the replacement are not satisfied.
There are other restrictions on function filter logic, such as:
Subqueries are not allowed.
Filter expressions must be deterministic. For example, the getdate() function cannot be used in place of the date literal used in the sample function.
See the “Select Rows …” link in the Resources section for more information about using Predicates with Stretch DB.
The performance of the function query influences the performance of the migration. If indexes would improve the query, then implementing them will increase the performance of the Stretch DB migration process as well. An index on the TransactionDate column of the Stretch_Table table would improve the performance of this function.
I found that a unique index added to a table that was already stretched prevented the data from being migrated out of Azure and back into the source database (ALTER table with MIGRATION_STATE = INBOUND). Although the command succeeded, the inbound process did not run until the unique index was dropped.
You can see a table’s current Stretch DB filter by running the following T-SQL query:
It is important to note that there are costs incurred when using Stretch DB. These are monthly Azure charges for processing (compute) and data storage. There are multiple performance levels available to choose from based on your data volume and desired performance. See the “…Pricing” link for more cost information.
The Stretch Database functionality is an easy way to separate current and historical data in your SQL Server while still allowing seamless access to all data. Even with the current limitations, it can be an appealing solution for some data archive problems and could be useful in many organizations.