Business Intelligence & Data Warehouse Tutorials: 2016

In this post I will provide a method how you can dynamically create a range of values.
I recently had to get a range of month values for filtering on a data set, but I had no date table on the database and needed to find a quick way how to solve this issue without going the route of creating a table and loading values.
So I ended up writing a recursive query using a Common Table Expression.

DECLARE @MinDate date = '2015-01-01',
@EndDate date = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0)

;WITH DateSerial AS
(
    SELECT @MinDate MonthDate -- 1 Invocation of the routine

    UNION ALL

    SELECT DATEADD(MONTH, 1, MonthDate)   --Recursive invocation of the routine
    FROM DateSerial
    WHERE MonthDate < @EndDate)

SELECT * FROM DateSerial

OPTION (MAXRECURSION 0)

In the first part I declared a start and end date to define the range.

A recursive Common Table Expression can be broken down into 3 parts:

1) Invocation of the routine
2) Recursive invocation of the routine
3) Termination Check

In the above sample a Common Table Expression is used where a first Select picks up the start date
and a UNION ALL combines the other results that are created until the end date is reached.

Depending on how big the range is, you are very likely to receive this error message at some point:

"The maximum recursion .... has been exhausted before statement completion"

To avoid this error message, the command OPTION (maxrecursion n) lets you define how often the Common Table Expression can recurse until it reaches an error state. In my sample I set the value to 0, meaning infinite recursion.

The querey delivered follwing result:

In this blog entry, I would like to describe how to reorganize or rebuild indexes.

First of all, SQL server stores row or index data on 8KB sized memory Pages. For each table (no matter how big), a own page is created. If the table exceeds a certain size due to manipulation of data (e.g. new row inserts), the page will be extended („Page Split“) to a second page etc.

The new page will also cause the creation of a new page in the upper level with the index tree, which may also lead to a page split.

Page Spilts are time consuming, cause fragmentation and index pages aren’t phyically ordered as supposed. All this results to a decrease of performance.

Let's have a brief look on the architecture of indexes in SQL Server.

In a table without a clustered index, the data will be stored within the page unsorted („Heap“).

Indexes have a root page, which is the starting point. The tree is split between non-leaf (root and intermediate leafs) and leaf-levels.

In a clustered index, the leaf level is the actual data page. This means the data is stored here in ordered form.

The most significant difference in the index architecture between clustered and non-clustered, is that the leaf level in non-clustered indexes contain key values and not the actual data.

In case of a heap, there is a row locator in the leaf node pointing to the correct data page.

A non-clustered index in a table that has a clustered index, will refer to the clustered index for pointing the row location.

Disadvantage here is that non-clustered indexes will have to be rebuild if the clustered index is rebuild or dropped. An advantage here is that a Reorganize of a clustered index, will mean that the non-clustered index does not have to be reorganized.

To solve the above listed issues concerning performance, indexes need to be REORGANIZEd or REBUILD.

The level of fragmentation can be analyzed with the help of system function sys.dm_db_index_physical_stats

Here is a more precise query:
SELECT sch.name + ' ' + tab.name             AS 'Table',
idx.name                                                       AS 'Index',
stat.avg_fragmentation_in_percent,
stat.page_count,
idx.type_desc                                                AS IndexType
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS stat
INNER JOIN sys.tables AS tab                   ON tab.object_id = stat.object_id
INNER JOIN sys.schemas AS sch               ON tab.schema_id = sch.schema_id
INNER JOIN sys.indexes AS idx                 ON idx.object_id = stat.object_id
AND stat.index_id = idx.index_id
ORDER BY stat.avg_fragmentation_in_percent DESC
If the average fragmention is between 5 and 30 %, a REORGANIZE is fine, if it's above 30% REBUILD is advised.

Following T-SQL syntax is required for REORGANIZE:

ALTER INDEX indexname

ON tablename REORGANIZE

In case of a rebuild, there is an option for a fill factor. This means the page will be filled with the percentage value defined. The rest is left open to avoid a fast fragmentation.

ALTER INDEX indexname

REBUILD WITH (FILLFACTOR = 80)

The fillfactor by default will only relate to the leaf level of the tree. If you want to also cover the other pages PAD_INDEX = ON will do the job.

ALTER INDEX indexname
REBUILD WITH (FILLFACTOR = 80, PAD_INDEX = ON)

Business Intelligence & Data Warehouse Tutorials

Donnerstag, 22. Dezember 2016

T-SQL: Dynamically Create Date Ranges With Recursive Queries

Freitag, 28. Oktober 2016

SQL SERVER ADMINISTRATION: REORGANIZE OR REBUILD CLUSTERED AND NON-CLUSTERED INDEXES