Tuesday, September 26, 2017

SQL SERVER ADMINISTRATION: FULL, DIFFERENTIAL AND TRANSACTION LOG BACKUP TYPES



In this tutorial, I will go through the different types available for creating Backups, which will be essential for your decision on which Backup Strategy to go for.
I will mainly focus on Full, Differential and Transaction Log Backups.

FULL DATABASE BACKUPS
This Backup type will include all objects (views, procedures, functions, ...), the tables including their data, users and rights.
A Full Backup enables you to restore the database as it was at the time the Backup was created. Transactions occurring during the Backup process will also be captured.

DIFFERENTIAL BACKUPS
Differential Backups will capture the data that has been altered since the last Full Backup. So when restoring, you will need the Full + Differential Backup files.
An advantage is that Differential Backups are faster than Full Backups (the difference can be huge depending on the size of the Database).

TRANSACTION LOG BACKUPS
Transaction Log Backups will capture all changes that occurred since the last Transaction Log Backup. Within this process the transaction log is also cleaned up (a removal of transactions that have been committed or rolled back).
This Backup type captures data up to the time the Backup process was started, whereas Full and Differential Backups also capture transactions that occur during the backup process.

FILE AND FILEGROUP BACKUPS
This type is usually used for large Databases. Basically, specific Files and Filegroups are backed up instead of the whole database.
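
As a sketch (the filegroup name and the backup path are assumptions for illustration), such a backup could look like this:

--Filegroup Backup
BACKUP DATABASE Databasename
FILEGROUP = 'PRIMARY'
TO DISK = N'C:\Backup\Databasename_Primary.bak'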

COPY-ONLY BACKUPS
Copy-Only Backups will deliver the same results as a Full Database Backup or Transaction Log Backup. The reason for using a Copy-Only Backup is that this Backup type is ignored by the backup chain, meaning a Copy-Only Backup initiated between a Full and a Differential or Log Backup will not reset the differential base and will not break the log chain.
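
A minimal sketch (the path is an assumption):

--Copy-Only Backup: does not reset the differential base
BACKUP DATABASE Databasename
TO DISK = N'C:\Backup\Databasename_CopyOnly.bak'
WITH COPY_ONLY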

You can create a Backup either manually or with a T-SQL command.
If you decide to do it manually, right-click on any Database -> Tasks -> Back Up

T-SQL:

--Full Backup (sample paths below; adjust to your environment)
BACKUP DATABASE Databasename
TO DISK = N'C:\Backup\Databasename_Full.bak'

--Transaction Log Backup
BACKUP LOG Databasename
TO DISK = N'C:\Backup\Databasename_Log.trn'

--Differential Backup
BACKUP DATABASE Databasename
TO DISK = N'C:\Backup\Databasename_Diff.bak'
WITH DIFFERENTIAL

Select the Database, the Backup Type and Destination

To get a better understanding of the different Backup Types, let's have a look at the table below.
At the time I created the Full Backup, the table was still empty:

I inserted 1 country (Sweden) into the table and created a Differential and a Transaction Log Backup.

Finally, I inserted another row (Netherlands) into the table and created another Transaction Log Backup.

RESTORING BACKUPS

Now that we have created our Backups, we will need to restore them. Right-click on the Destination Database -> Tasks -> Restore -> Database

On the General tab, select Device and click on the ... button to select the location of the stored Backup files.
You will also need to select a Destination Database.
On the Files tab, ensure your files are pointing to the right folder (and not to the one of the original source).

On the Options tab, select Overwrite the existing database and RESTORE WITH RECOVERY.

To restore a FULL Backup you need to select the Full Backup File:

To restore a DIFFERENTIAL Backup you need to select the Full Backup file + Differential Backup files:

To restore a TRANSACTION LOG Backup you need to select the Full Backup file + Transaction Log files:

As you can see in the picture above, you have the option to restore to any point in time at which a Transaction Log Backup was created.
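
The same restore chain can be written in T-SQL. The sketch below assumes the file names and the STOPAT time; the Full Backup is restored WITH NORECOVERY so that the following Log Backups can be applied:

--Restore the Full Backup, keep the database in the restoring state
RESTORE DATABASE Databasename
FROM DISK = N'C:\Backup\Databasename_Full.bak'
WITH REPLACE, NORECOVERY

--Apply the Transaction Log Backups
RESTORE LOG Databasename
FROM DISK = N'C:\Backup\Databasename_Log1.trn'
WITH NORECOVERY

--Recover the database, optionally stopping at a point in time
RESTORE LOG Databasename
FROM DISK = N'C:\Backup\Databasename_Log2.trn'
WITH RECOVERY, STOPAT = '2017-09-26 10:00:00'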
To get a better picture, I will summarize the 3 Backup Types in relation to the entries in the Country table.

Tuesday, May 23, 2017

SQL SERVER ADMINISTRATION: EXTRACT SHOWPLAN RESULTS FROM SQL SERVER PROFILER FOR FURTHER ANALYSIS



The Estimated or Actual Execution Plan provided by SQL Server gives a nice graphical overview, but it can be hard to read if you have too many objects, and you also need to point your cursor at each object for further information. There is also an option to extract the Execution Plan as XML, but that requires additional steps to make it readable. In this tutorial I will show an alternative way with SQL Server Profiler that provides detailed execution plan information in text form, which can easily be reused for further analysis.

Once you've opened SQL Server Profiler, click on the File tab and then on New Trace.

SQL Server Profiler returns a massive amount of information. To narrow down the results and to focus on our main goal, some settings are essential. In the Trace Properties, check Show all events on the Events Selection tab and expand the Performance section to capture the Showplan data.

Under Performance you will see all types of Showplan information. The one most relevant for our exercise here is Showplan All.
To narrow down the results even more, click on Column Filters in the Events Selection tab. Depending on the column and operator, you can basically filter on any available column with an exact search term or use wildcards. I will filter on my login to avoid getting system admin records returned.


Meanwhile I executed my query from SSMS, and SQL Server Profiler immediately returned the following records:

Showplan All returns the execution plan and, in addition, detailed information such as which type of join operator is used, CPU costs, number of executions, etc. Simply mark the relevant text, right-click and copy.

You can save this info as a txt file or simply paste it into Excel. Once you have it in the right format, the whole thing is very easy to read, and you can sort descending by the most expensive objects to lead you straight to the root cause.
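
If you only need the text-form plan and want to skip Profiler, a similar output can be produced directly in SSMS with SET SHOWPLAN_ALL (a minimal sketch; the query is a placeholder and is compiled but not executed while the setting is ON):

SET SHOWPLAN_ALL ON;
GO
SELECT so.SalesOrderID FROM Sales.SalesOrderHeader so;  --your query here
GO
SET SHOWPLAN_ALL OFF;
GO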

Saturday, March 11, 2017

T-SQL: PARSING VALUES BASED ON NTH OCCURRENCE OF A CHARACTER IN A STRING



I recently had to parse values stored in a column that had the following pattern: '-Word1-Word2-Word3-Word4-' etc.
SQL Server already has the built-in function CHARINDEX(), which is quite helpful, but the issue is that it only returns the position of the 1st occurrence.
Technically I could nest one CHARINDEX() inside another, but that would create a big piece of code that is a little hard to read, plus it is not a dynamic solution. So I decided to create a function that returns the values based on input parameters:

CREATE FUNCTION dbo.ParseStringValue(
@SplitCharacter nvarchar(5),
@TargetString nvarchar(50),
@NthOccurence int)

RETURNS nvarchar(50)

AS
BEGIN

DECLARE @Pos1 int = 0,
        @Pos2 int = 0,
        @Count int = 0,
        @Result nvarchar(50)

--Walk forward until the nth occurrence of the split character is reached
WHILE @Count <= @NthOccurence
BEGIN

SET @Pos1 = @Pos2
SET @Pos2 = CHARINDEX(@SplitCharacter, @TargetString, @Pos1 + 1)
SET @Count = @Count + 1

END

--Extract the value between the two found split characters
SET @Result = SUBSTRING(@TargetString, @Pos1 + 1, (@Pos2 - 1) - @Pos1)

RETURN (@Result)
END


I will use the string '-Aziz-Sharif-blogspot-com-' for our sample and will search for the value after the 4th occurrence of the split character '-'.

The function is called in the SELECT part, and I need to pass 3 parameters:
@SplitCharacter: the separator value '-'
@TargetString: the string value - here a fixed string for demonstration, but this could be a column as well
@NthOccurence: the nth occurrence of the character in the string after which the value should be returned

SELECT dbo.ParseStringValue('-', '-Aziz-Sharif-blogspot-com-', 4)

The WHILE loop will go through the same piece of code and stop once it has reached the nth occurrence:

WHILE @Count <= @NthOccurence
BEGIN

SET @Pos1 = @Pos2
SET @Pos2 = CHARINDEX(@SplitCharacter, @TargetString, @Pos1 + 1)
SET @Count = @Count + 1

END


I will run the query in debug mode and break down the values of each run.

Within the loop I determine the positions of the split character. Since I need to find the value between 2 characters, I have declared 2 integer variables: @Pos1 and @Pos2.


SET @Pos1 = @Pos2
SET @Pos2 = CHARINDEX(@SplitCharacter, @TargetString, @Pos1 + 1)


@Pos2 is passed to @Pos1, so @Pos1 stores the most recent position, while @Pos2 returns the next position of the split character after @Pos1:
SET @Pos2 = CHARINDEX(@SplitCharacter, @TargetString, @Pos1 + 1)
The @Pos1 + 1 part defines the start position of the search.

The final relevant values stored are:
@Pos1 = 22 -> '-Aziz-Sharif-blogspot-com-'
@Pos2 = 26 -> '-Aziz-Sharif-blogspot-com-'

Since I need to extract the value between the 2 split characters, I need to add 1 to the start position and deduct 1 from the end position:
SET @Result = SUBSTRING(@TargetString, @Pos1 + 1, (@Pos2 - 1) - @Pos1)

The function returns the value after the 4th occurrence of the split character in the defined string.
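
Based on the positions above, the sample call should return 'com', the value between the 4th and 5th split characters:

SELECT dbo.ParseStringValue('-', '-Aziz-Sharif-blogspot-com-', 4)
--Expected result: com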

Sunday, February 26, 2017

T-SQL: JOIN OPERATORS - NESTED LOOP JOIN, MERGE JOIN AND HASH JOIN



In this tutorial, I will describe the 3 Join Operators SQL Server uses and the reasons why the Optimizer selects each of them.

I will focus on the Inner Join and will stick to using 2 tables. We basically have 3 scenarios:

1) None of the tables has an index on the joining column
2) One of the two tables has an index on the joining column
3) Both tables have an index on the joining column

In the samples below I will use 2 tables from the AdventureWorks Database and modify indexes to demonstrate why the Optimizer decides to vary the join operator. The following 2 tables will be used:
Sales.SalesOrderHeader
Sales.SalesTerritory

The following query will be used throughout all samples:
SELECT  so.SalesOrderID
      ,st.Name
      ,so.OrderDate
      ,so.DueDate
      ,so.ShipDate
      ,so.SalesOrderNumber
      ,so.PurchaseOrderNumber
      ,so.AccountNumber
      ,so.CurrencyRateID
      ,so.TotalDue
  FROM Sales.SalesOrderHeader so
  INNER JOIN Sales.SalesTerritory st    ON so.TerritoryID = st.TerritoryID


NESTED LOOP JOIN

Tablename                Row Count   Indexed Column
Sales.SalesOrderHeader   31465       Clustered Index on joining column
Sales.SalesTerritory     10          No index

A Nested Loop Join consists of an inner input (the lower object in the Execution Plan) and an outer input (the upper object in the Execution Plan).
The outer input gets scanned row by row, and for each row an index seek is performed on the inner table (the larger, indexed table).
This is exactly the advantage of Nested Loop Joins compared to Hash and Merge Joins: the outer, smaller input is placed in memory and is compared with an indexed inner input.
Let's have a closer look at the Execution Plan by placing the cursor above the object in question. The Clustered Index Scan shows us that there was 1 execution returning 10 rows (all rows in that table).

The Clustered Index Seek provides the information that 10 executions took place, returning 31465 rows. Therefore, keep in mind that Nested Loop Joins will always work well with a small outer input and an indexed inner input.

MERGE JOIN

Tablename                Row Count   Indexed Column
Sales.SalesOrderHeader   31465       Clustered Index on joining column
Sales.SalesTerritory     10          Clustered Index on joining column

The requirement for a Merge Join is that both tables are sorted on the joining column. Assuming you have an index on both columns, the sort is already covered here.
If you force a Merge Join by passing a hint and an index is missing on one or both of the columns, the Sort operator will be added to the plan, which is quite expensive due to its high memory and I/O consumption.
A Merge Join can be very fast when the columns are ordered by the index. In this case matching rows are produced while the sorted columns are compared for equality.
The big advantage of Merge Joins is that both inputs are scanned only once.

Although I have an index on both tables, the Optimizer will probably still go for a Nested Loop Join here, since the outer input is quite small. Merge Joins are perfect for indexed and larger tables.

HASH JOIN
Tablename                Row Count   Indexed Column
Sales.SalesOrderHeader   31465       No index
Sales.SalesTerritory     10          No index

Hash Joins are usually selected when there is no other option, meaning the data is neither sorted nor indexed. A Hash Join consists of 2 inputs:
1) Build input: the upper object in the execution plan and usually the smaller table, since a hash table is built from it in memory and a smaller input keeps memory usage low
2) Probe input: the lower object in the execution plan
Hash Joins are used by the Optimizer to process large, unsorted and non-indexed inputs.


In most cases the Optimizer will make the right choice, but in cases where statistics are not up to date, it may be necessary to enforce the right option.
You can do this by either adding an OPTION (... JOIN) hint at the end of the query

  OPTION (LOOP JOIN)
  OPTION (MERGE JOIN)
  OPTION (HASH JOIN)


or by adding the hint in the inner join part of the query (a usage sketch follows below)

  INNER LOOP JOIN
  INNER MERGE JOIN
  INNER HASH JOIN
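
Applied to the sample query, a forced Merge Join would look like the sketch below. Hints should stay the exception, since they override the Optimizer even when the data changes later:

SELECT so.SalesOrderID
      ,st.Name
FROM Sales.SalesOrderHeader so
INNER MERGE JOIN Sales.SalesTerritory st ON so.TerritoryID = st.TerritoryID

--equivalent, using the query-level hint:
SELECT so.SalesOrderID
      ,st.Name
FROM Sales.SalesOrderHeader so
INNER JOIN Sales.SalesTerritory st ON so.TerritoryID = st.TerritoryID
OPTION (MERGE JOIN)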


Sunday, January 1, 2017

T-SQL: SQL Server Transaction Isolation Levels



There are 4 attributes guaranteed to a transaction (ACID):

A: Atomicity - In a transaction, either all included elements get committed or none of them.
C: Consistency - Data integrity is ensured, meaning if any failure occurs during the transaction, the previous state at the beginning of the transaction is kept.
I: Isolation - A transaction in progress may not be affected by other transactions.
D: Durability - Committed data is saved permanently by the system.

There are 4 different isolation levels. A sample with SQL code for each type will be provided.
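
To check which isolation level your current session is running with, you can query sys.dm_exec_sessions (a small sketch; the CASE mapping follows the documented level values):

SELECT CASE transaction_isolation_level
            WHEN 1 THEN 'READ UNCOMMITTED'
            WHEN 2 THEN 'READ COMMITTED'
            WHEN 3 THEN 'REPEATABLE READ'
            WHEN 4 THEN 'SERIALIZABLE'
            WHEN 5 THEN 'SNAPSHOT'
       END AS CurrentIsolationLevel
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID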

I will use the following Artist table as a basis for my samples:

SELECT [ArtistID]
              ,[Artist]
FROM [Tutorials].[stg].[Artist]

ArtistID    Artist
----------- ------------
1           00Agents
2           06 Style
3           10°
4           1000 Ohm
.......

1) READ UNCOMMITTED
This isolation level allows reading changed data from other transactions that are still open, which may later be rolled back.

Transaction 1:
SET TRANSACTION ISOLATION LEVEL
READ UNCOMMITTED

SELECT [ArtistID]
              ,[Artist]
FROM [Tutorials].[stg].[Artist]
WHERE Artist = '00Agents'


The query delivers the following result:
ArtistID    Artist
----------- -----------
1           00Agents

(1 row(s) affected)

Now I will run the following transaction, but won't commit the transaction:

Transaction 2:
BEGIN TRANSACTION

INSERT INTO [Tutorials].[stg].[Artist]
VALUES
('00Agents')


If I run the first query again, I will now also see the inserted value from above (the not yet committed transaction)

Transaction 1:
SET TRANSACTION ISOLATION LEVEL
READ UNCOMMITTED

SELECT [ArtistID]
              ,[Artist]
FROM [Tutorials].[stg].[Artist]
WHERE Artist = '00Agents'


ArtistID    Artist
----------- -----------
1           00Agents
38         00Agents

(2 row(s) affected)

Instead of committing, I will now roll back the transaction.
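
For completeness, the statement run in the session that started Transaction 2:

ROLLBACK TRANSACTION

Running the SELECT again now shows the original result: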

ArtistID    Artist
----------- -----------
1           00Agents

(1 row(s) affected)

This scenario, where uncommitted transactions can be read, is often referred to as a "Dirty Read".


2) READ COMMITTED
This is the default isolation level in SQL Server.
Here we are able to see only committed transactions:

Assuming I'm inserting values into a table and I open a second transaction in which I want to read the table, the following will happen.

Here we insert the values (but do not commit right away, for demonstration reasons):

Transaction 1:
BEGIN TRANSACTION

INSERT INTO [Tutorials].[stg].[Artist]
VALUES
('00Agents')



Now we want to select values from that same table:


Transaction 2:
SET TRANSACTION ISOLATION LEVEL
READ COMMITTED

SELECT [ArtistID]
              ,[Artist]
FROM [Tutorials].[stg].[Artist]
WHERE Artist = '00Agents'

Transaction 2 will be blocked and deliver no result until Transaction 1 is finished.

Once I commit the transaction with the inserted values, the select query (Transaction 2) shows results:

Transaction 1:
--BEGIN TRANSACTION

--INSERT INTO [Tutorials].[stg].[Artist]
--VALUES
--('00Agents')

COMMIT TRANSACTION


ArtistID    Artist
----------- ----------
1            00Agents
39          00Agents

(2 row(s) affected)


3) REPEATABLE READ
This isolation level guarantees that data read in a transaction will deliver the same result set when read again later in that transaction. In the sample below I will run a SELECT query, add a WAITFOR DELAY, start a second session during that pause that updates values in the underlying table, and then, after the delay, run the same SELECT statement in Session 1 to see if the changes show up.

Transaction 1
SET TRANSACTION ISOLATION LEVEL
REPEATABLE READ

BEGIN TRANSACTION

SELECT [ArtistID]
              ,[Artist]

FROM [Tutorials].[stg].[Artist]
WHERE Artist = '00Agents'

WAITFOR DELAY '00:00:05'

SELECT [ArtistID]
              ,[Artist]

FROM [Tutorials].[stg].[Artist]
WHERE Artist = '00Agents'

COMMIT TRANSACTION


 Transaction 2:

BEGIN TRANSACTION

UPDATE [Tutorials].[stg].[Artist]
SET Artist = '01Agents'
WHERE ArtistID = 1

COMMIT TRANSACTION


Although we have an update between the 2 queries, the result is the same.


ArtistID    Artist
----------- ----------
1            00Agents
39          00Agents

(2 row(s) affected)

ArtistID    Artist
----------- ----------
1            00Agents
39          00Agents

(2 row(s) affected)

The reason why the change only becomes visible after Transaction 1 is finished is a shared lock that is held until the transaction with Isolation Level REPEATABLE READ completes.

4) SERIALIZABLE

This Isolation Level is quite comparable to REPEATABLE READ. The main difference is that SERIALIZABLE also prevents other transactions from inserting new rows that would match the query. Under REPEATABLE READ, updates and deletes of the read rows are blocked, but inserts are technically still possible; such newly appearing rows are known as "Phantom Reads", and Isolation Level SERIALIZABLE prevents them as well.
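
A minimal sketch to observe the difference with the Artist table (same pattern as the REPEATABLE READ sample; the INSERT in Session 2 is blocked until Session 1 commits):

--Session 1
SET TRANSACTION ISOLATION LEVEL
SERIALIZABLE

BEGIN TRANSACTION

SELECT [ArtistID]
              ,[Artist]
FROM [Tutorials].[stg].[Artist]
WHERE Artist = '00Agents'

WAITFOR DELAY '00:00:05'

SELECT [ArtistID]
              ,[Artist]
FROM [Tutorials].[stg].[Artist]
WHERE Artist = '00Agents'

COMMIT TRANSACTION

--Session 2, started during the delay: the INSERT waits for Session 1
BEGIN TRANSACTION

INSERT INTO [Tutorials].[stg].[Artist]
VALUES
('00Agents')

COMMIT TRANSACTION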