I’ve always had issues writing a TSQL Pivot statement without going back to books-on-line to figure it out.  Maybe because I haven’t done it enough, but it always seems a bit convoluted.  That meant it was time to write about it and break it down so I could make sense of it.  Let’s set up the demo data first.  Based on the data I was working on pivoting I wrote up some simple demo data.  It has a name, objective and a result for the said objective.  It has one record per objective and the pivot should line up those objectives so there is one row per name instead of one row per objective.

DECLARE @BaseData table (  

Name   varchar(10)  

,Objective  varchar(5)  

,Result   varchar(15)  )

 

INSERT @BaseData

SELECT ‘Bob’,’Obj1′,’Complete’

UNION

SELECT ‘Bob’,’Obj2′,’Pending’

UNION

SELECT ‘Lisa’,’Obj1′,’Pending’

UNION

SELECT ‘Lisa’,’Obj2′,’Pending’

UNION

SELECT ‘Don’,’Obj1′,’Complete’

UNION

SELECT ‘Seth’,’Obj1′,’Complete’

UNION

SELECT ‘Seth’,’Obj2′,’Complete’

UNION

SELECT ‘Megan’,’Obj1′,’Pending’

UNION

SELECT ‘Megan’,’Obj2′,’Complete’

 

 

SELECT * FROM @BaseData

 

 Now to work on understanding the pivot.  Here’s the finished Pivot statement for reference: 

PIVOT(MAX(Result) for Objective in ([OBJ1],[OBJ2])) as ResultPivot

 The [Result] column is the data that needs to be pivoted.  The first stumbling part of the pivot is that the data that is being pivoted has to be wrapped in some kind of accumulation.  So if you are dealing with numerical data it can be a MIN, MAX, AVG, etc.  With string data, which is what I dealt with the easiest is MIN or MAX.  Since there is only one result per objective in this data, the MAX will be that value.

The next part of the pivot is naming the columns that the [Result] data will be pivoted into.  I associated this to telling the PIVOT command what the key was to the data to be pivoted.  In this case there are consistent values in the Objective column of Obj1 and Obj2.  Not all Names in the data have an Obj1 and Obj2 which is fine as you will see.  If you don’t have a consistent value like this I could see using a derived column like a ROWNUMBER() or something similar to give the data values to create columns on.

After naming the columns that the pivoted data will go into the entire statement needs an alias.  It won’t necessarily be used in the SELECT statement but the PIVOT statement will throw errors if it is not aliased.

Here is how the PIVOT is worked into the query. 

SELECT Name, OBJ1 as Objective1, OBJ2 as Objective2                FROM @BaseData

                PIVOT(MAX(Result) for Objective in ([OBJ1],[OBJ2])) as ResultPivot

 Notice in the SELECT statement the columns from the pivot get the names from the PIVOT statement.  I did this because I wanted to alias the column names to be more descriptive.  If you get rid of the column names and just put in SELECT it will work too.  If you put the [RESULT] column in the columns to SELECT the statement will error and throw Invalid Column Name.’  The [RESULT] column basically no longer exists when the data is pivoted.  When the query is run (with the table variable above to create the data) the results will look like this:

Name Objective1 Objective2
Bob Complete Pending
Don Complete NULL
Lisa Pending Pending
Megan Pending Complete
Seth Complete Complete

 If you look at Don’s record you will see that Objective 2 is NULL.  That objective does not exist for him.  If there are hundreds or thousands of records that are being pivoted and a flag needs to be created if there are missing objective records it’s simple.  Those objective columns can be used in the query just as normal columns.  In the query below you can see how a CASE statement was added to flag those people with missing records.

SELECT Name, OBJ1 as Objective1, OBJ2 as Objective2                ,CASE

                                WHEN OBJ1 is NULL THEN ‘TRUE’

                                WHEN OBJ2 is NULL THEN ‘TRUE’

                                ELSE ”

                                END as ErrorFlag

                FROM @BaseData

                PIVOT(MAX(Result) for Objective in ([OBJ1],[OBJ2])) as ResultPivot

 The results of this query will have an additional column called ErrorFlag that will have a value of TRUE if there is a NULL in Objective 1 or Objective 2.  This will allow for analysis of the data to easily point out those records.

Name Objective1 Objective2 ErrorFlag
Bob Complete Pending  
Don Complete NULL TRUE
Lisa Pending Pending  
Megan Pending Complete  
Seth Complete Complete  

 Another way of making the NULL data stand out or if NULL data doesn’t matter at all the OBJ1 and OBJ2 could be wrapped in an ISNULL.  That could be used to put in an empty string or some value to indicate problems with the data.

Hopefully, this helps out.  I’m sure the next time I write a PIVOT I’ll have to dig out this posting and skim through it real quick to remember the details.