
Blog

Colorado SQL Events for April 2014

Date Group Info
Tuesday, April 1, 2014 Denver SQL Lunch
Tuesday, April 15, 2014 Boulder Taking your Skills to the Next Level by Doug Brendle
Tuesday, April 15, 2014 Northern Colorado Evening Meeting

PASS SQLSaturday Events: April 2014


Date Event
April 5, 2014 SQLSaturday #296 - Melbourne 2014
April 5, 2014 SQLSaturday #295 - Las Vegas 2014
April 12, 2014 SQLSaturday #283 - Puerto Rico 2014
April 12, 2014 SQLSaturday #297 - Colorado Springs 2014
April 12, 2014 SQLSaturday #267 - Lisbon 2014
April 26, 2014 SQLSaturday #270 - Krasnodar 2014
April 26, 2014 SQLSaturday #284 - Porto Alegre 2014
April 26, 2014 SQLSaturday #282 - Costa Rica 2014
April 26, 2014 SQLSaturday #291 - Chicago 2014
April 26, 2014 SQLSaturday #289 - Orange County, CA 2014
View future events on the SQLSaturday website.
By |March 15th, 2014|PASS, SQLSaturday|

Colorado SQL Events for March 2014

Date Group Info
Tuesday, March 4, 2014 Denver SQL Lunch
Tuesday, March 18, 2014 Boulder Advanced Spatial Analysis: Beyond Bubble Charts by Jason Horner
Monday, March 17, 2014 Northern Colorado
Wednesday, March 19, 2014 Colorado Springs
By |February 26th, 2014|Colorado SQL Meetings|

PASS SQLSaturday Events: March 2014

Date Event
March 1, 2014 SQLSaturday #278 – Budapest 2014
March 1, 2014 SQLSaturday #268 – Connecticut 2014
March 6, 2014 SQLSaturday #280 – Vienna 2014
March 15, 2014 SQLSaturday #276 – Silicon Valley 2014
March 22, 2014 SQLSaturday #269 – Exeter 2014
March 22, 2014 SQLSaturday #277 – Richmond 2014
March 29, 2014 SQLSaturday #275 – Copenhagen 2014
March 29, 2014 SQLSaturday #262 – Boston 2014
March 29, 2014 SQLSaturday #287 – Madison 2014
March 29, 2014 SQLSaturday #279 – Phoenix 2014
View future events on the SQLSaturday website.
By |February 18th, 2014|PASS, SQLSaturday|

SQL 2014 Natively Compiled Stored Procedures: Unsupported TSQL in In-Memory OLTP

One of the major new features of Microsoft SQL Server 2014 is In-Memory OLTP.  If you have not heard of it, it boils down to this: tables can be stored completely in memory, and all operations against those tables and their data occur in memory.  This lets SQL Server work with data without the I/O waits that come with disk-based tables.  In versions of SQL Server prior to 2014, stored procedures were strictly interpreted.  With the in-memory tables in SQL Server 2014 came the ability to create natively compiled stored procedures.  The advantage is better performance when stored procedures run against in-memory tables.  As with anything, there are caveats.  Natively compiled stored procedures will not run against disk-based tables, and there are limitations on the TSQL you can use in them.  Interpreted stored procedures will still run against an in-memory table, but they have to do so through a query interop.
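
To make that concrete, here is a minimal sketch of what creating a memory-optimized table looks like in SQL Server 2014 (the table and column names are made up for illustration, and it assumes the database already has a MEMORY_OPTIMIZED_DATA filegroup):

CREATE TABLE dbo.CustomerSession
(
	-- Memory-optimized tables require at least one index; a nonclustered
	-- hash index on the primary key is the common choice for point lookups.
	SessionId INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
	CustomerId INT NOT NULL,
	StartedAt DATETIME2 NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)  -- both schema and data survive a restart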

The part of the limitations that jumped out at me was the restrictions on TSQL.  There is definitely going to be a learning curve, because some very common TSQL features are not supported.  The first thing that jumped out was that cursors are not supported! (Yay!)  I know this will cause problems for some people, but I’ve always gone out of my way to find ways around using cursors.  Temp tables are also not supported in natively compiled stored procedures, since they are created on disk and natively compiled stored procedures cannot interact with disk-based tables.  This will require some work, but it can be overcome by using table variables with a defined table type.  There is also the option of defining a SCHEMA_ONLY in-memory table to use as a temp table.  This means the table structure is durable, but not the data within it.  When SQL Server is shut down, any data within the SCHEMA_ONLY table(s) is gone, but the table structure is re-created when the server starts.
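
Here is a rough sketch of both temp-table stand-ins described above (all object names are hypothetical):

-- A memory-optimized table type, usable for table variables
-- inside a natively compiled procedure:
CREATE TYPE dbo.tt_Staging AS TABLE
(
	Id INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1024)
)
WITH (MEMORY_OPTIMIZED = ON)
GO

-- A SCHEMA_ONLY table: the structure survives a restart, the data does not.
CREATE TABLE dbo.ScratchWork
(
	Id INT NOT NULL PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1024),
	Payload NVARCHAR(200) NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)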

Here are some of the features I consider commonly used that are not supported in a natively compiled stored procedure. (A skeleton of a natively compiled procedure follows the table.)

Feature Alternative
Cursor Set based logic or While Loops
Temp Tables Table variables (not inline) or an in-memory table created as SCHEMA_ONLY
Inline Table Variables Table variables can still be used, but they must be declared with a user defined table type rather than defined inline.
Common Table Expressions (CTE) Re-write the query to not use a CTE.
Subqueries (nested queries) Re-write the query to not use a subquery.
User Defined Functions (UDF) The logic from the UDF will have to be built into the stored procedure.
Execute or Insert Exec Include the logic from the external stored proc
OR and IN and CASE Create queries for each case
NOT This is just a change in syntax.  Use != instead
LIKE  
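
For reference, here is roughly what the shell of a natively compiled stored procedure looks like, written against the hypothetical dbo.CustomerSession table from earlier. SCHEMABINDING, EXECUTE AS, and the ATOMIC block options shown are required by SQL Server 2014:

CREATE PROCEDURE dbo.usp_GetSession
	@SessionId INT
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
	-- Only memory-optimized tables may be referenced here,
	-- and they must be schema-qualified.
	SELECT SessionId
		   , CustomerId
		   , StartedAt
	FROM   dbo.CustomerSession
	WHERE  SessionId = @SessionId
END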


For a full list of unsupported TSQL features in Natively Compiled Stored Procedures go here:  http://msdn.microsoft.com/en-us/library/dn246937(v=sql.120).aspx

Also, here is the white paper on In-Memory OLTP if you want to dig deeper:

http://download.microsoft.com/download/5/F/8/5F8D223F-E08B-41CC-8CE5-95B79908A872/SQL_Server_2014_In-Memory_OLTP_TDM_White_Paper.pdf

By |February 4th, 2014|TSQL|

February – Colorado SQL Events

Please register soon, as seating is limited, and let friends and colleagues know about the event. Most events are free; however, a nominal fee may be charged so that lunch (not pizza) can be provided.

Date Group Info
Tuesday, February 4, 2014 Denver SQL Lunch
Tuesday, February 18, 2014 Boulder SQL Server Performance
Monday, February 17, 2014 Northern Colorado Performance Tuning

12 Avoidances in the initial dashboard design phase

There are many milestones in the lifecycle of a dashboard. The initial stages are the most important for preventing headaches or outright failure. This list will hopefully give some guidance to get the ball rolling on a path to success.

1)      Avoid waiting for technology – Waiting for traditional BI projects to begin can take months or even years with delays. Find what technology you have available and start working on an implementation. Once you have done some groundwork, the design can be applied to other BI technologies with few adjustments.

2)      Avoid underestimating time and resources – Though a dashboard is typically designed to be compact, that does not mean it will be quick and easy to create and maintain. Expect it to be an ongoing, frequently changing process.

3)      Avoid static/stale data – Having data update automatically and frequently is very important to both the user and the designer.  Static or stale data will lead to a false sense of confidence.

4)      Avoid limited access – Making dashboards easily accessible is ideal. Making it easy for the user to reach the dashboard allows for more frequent use and feedback, helping keep the dashboard current and useful. For example, distributing dashboards on the web can help keep data current while still adhering to IT protocols and security standards.

5)      Avoid choosing the wrong metrics – Find the metrics that will answer the underlying questions.

  1. How does each metric contribute to the objectives?
  2. Is there data that can shed light on the objective?
  3. Is there a meaningful metric that measures those contributions?
  4. Are the chosen metrics necessary to contribute to the objective?
  5. Is the metric a continuous organized measurement?

6)      Avoid not knowing the goals and objectives of the dashboard – Some of the most common goals found in companies can be generalized as:

  1. Make better-informed business decisions.
  2. Improve customer interaction and satisfaction.
  3. Gain an overall competitive advantage.

The goals can usually be seen as having these characteristics at their core:

  1. Objective-focused
  2. Visual
  3. Relevant
  4. Current
  5. Accessible to its audience

7)      Avoid meaningless variety – When starting, keep it simple and aim for 3-5 metrics. Never go above 10. Too many metrics will likely lead to charts and graphs that add no value to the dashboard.

8)      Avoid ignoring the target audience – Give the audience what they need (e.g. timely summaries, links to supporting details, applicable actions, etc.). The usual audiences can be broken down into 3 categories.

  1. Strategic (Executives) – these users want a summary of overall health, long-term progress and goals, and historical data.
  2. Operational (Department Heads) – these users need a snapshot of current operations, clear meaning, real-time data, and interaction with links to further information.
  3. Analytic (Engineers, Programmers, Researchers, Lawyers, Paralegals, etc.) – these users need ways to do comparisons, review historical data, evaluate performance, drill into causes, get feedback, and establish goals to course-correct.

9)      Avoid using metrics no one understands – It is wise to avoid any unnecessary teaching of a new metric. Stick to the common, familiar metrics that the audience already knows and understands.

10)   Avoid using compound metrics – Compound metrics are usually built from a formula that combines a variety of measurements into a single value (examples: FICO, Klout, etc.). The problem is that compound metrics mask insight and understanding, making it hard to compare and contrast the real underlying numbers.

11)   Avoid blindly following easy measurements – Easy measurements are great, but they can cause users to lose sight of the bigger, more important picture. For example, SAT scores may be a great measure of a student’s potential at college, but they can also create a blind spot around the student’s life as a whole.

12)   Avoid complexity – Keep it simple. Focus on trends rather than point-in-time comparisons. Keep the focus on a handful of metrics and consistent design principles throughout the dashboard(s). Reduce, minimize, or eliminate non-data elements: tone them down by changing size, color, and borders, and remove data that does not add insight. Draw attention to, and maximize, the important data elements.


By |January 28th, 2014|Visualizations|

A Quick Intro into SQLCLR User Defined Functions

Initial Setup

For a quick-start guide into SQLCLR, the easiest first step will most likely be creating a user defined function.

To get started, I’m using the following installations:

  • SQL Server Data Tools for Visual Studio 2012 (http://msdn.microsoft.com/en-us/jj650015)
  • Microsoft SQL Server Data Tools Business Intelligence (http://www.microsoft.com/en-us/download/details.aspx?id=36843)
  • SQL Server 2012
  • A database to deploy to; I’ll be using Adventure Works 2012 (http://msftdbprodsamples.codeplex.com/releases/view/93587)

Creating the Project

To get started, open SQL Server Data Tools for Visual Studio 2012. Select New Project and go to Other Languages in the Templates section. Name the project (SQLCLRDemo):

CLRIntro1

Once in the project, in the Solution Explorer, right click on the project and select Add then New Item:

CLRIntro2

There are 5 basic types of SQL CLR objects that can be developed in C# for SQL Server:

  • Aggregate
  • Stored Procedure
  • Trigger
  • User Defined Function
  • User Defined Type

 Creating a User Defined Function

Let’s start with the easiest and most used SQLCLR type, the user defined function. For this example I’ll create a simple function called IsLeapYear, which accepts a year as an integer value and returns a Boolean:

CLRIntro3

I’m not going to do any special coding for this example, since C#’s DateTime library has a nice slew of usable functions. Here is the user defined function template with the added code:

//------------------------------------------------------------------
// <copyright file="CSSQLFunction.cs" company="Microsoft">
//    Copyright (c) Microsoft Corporation.  All rights reserved.
// </copyright>
//------------------------------------------------------------------
using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlBoolean IsLeapYear(SqlInt32 year)
    {
        // Cast the SqlInt32 down to a plain int and let the .NET DateTime
        // library do the work; the bool result converts implicitly
        // back to SqlBoolean.
        return DateTime.IsLeapYear((int)year);
    }
}

As you may have noticed, the data types used in SQL Server and C# are different: the SQL Server data types carry the prefix Sql. When using built-in C# libraries you will need to cast to a C# type, and the function’s return type needs to be a SQL data type (i.e. SqlBoolean).

Build the project to verify that the code is correct. Before deploying, you may want to check the project’s properties (PROJECT > SQLCLRDemo Properties…). Here you can set the target SQL Server platform, default schema, database settings, permission level, etc.

CLRIntro4

Deploying to SQL Server

To publish, right-click the project and select Publish…

CLRIntro5

Select the target database and then click Publish. Verify in the Data Tools Operations pane that everything goes over smoothly:

CLRIntro6

You should then see the new scalar-valued function in the database (you may have to refresh the list):

CLRIntro7
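
If you would rather verify from TSQL than from the object tree, a quick check against the catalog views works too (just a sketch; it lists user assemblies and the objects they back):

SELECT a.name AS AssemblyName
	   , o.name AS ObjectName
	   , o.type_desc
FROM   sys.assemblies a
JOIN   sys.assembly_modules m ON a.assembly_id = m.assembly_id
JOIN   sys.objects o ON m.object_id = o.object_id
WHERE  a.is_user_defined = 1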

Testing the SQLCLR function

So far so good, let’s test the function:

USE [AdventureWorks2012]
GO

;WITH cteYears AS
(
	SELECT 2013 as iYear
	UNION ALL
	SELECT 2012 as iYear
	UNION ALL
	SELECT 2011 as iYear
	UNION ALL
	SELECT 2010 as iYear
)

SELECT 
iYear,
dbo.IsLeapYear(iYear)
FROM cteYears
GO

Uh, oh. We get an error:

CLRIntro8

Simple enough: we just need to enable CLR integration (http://technet.microsoft.com/en-us/library/ms131048.aspx). Run this script on the target instance (the setting is server-level):

sp_configure 'show advanced options', 1;
GO
RECONFIGURE;
GO
sp_configure 'clr enabled', 1;
GO
RECONFIGURE;
GO

Re-run the test script, and we finally get our results:

CLRIntro9

Conclusion

Creating SQLCLR functions can make SQL development easier and can provide a faster technique than regular SQL Server functions.

The other SQLCLR types can get more complicated, with higher security needs, potentially unsafe code, serialization, and importing DLLs into SQL Server, as some examples. I will try to do more in-depth SQLCLR tutorials in the future that run into some of these situations.


By |January 21st, 2014|Uncategorized|

Using DML against a Common Table Expression (CTE)

In my previous life I did support, both in a direct customer support role and, most recently, in a backline support role.  In that backline role I did a lot of data manipulation (updates and deletes) trying to get customers’ data corrected.  Something was always wrong, through their doing or ours.  (Mostly theirs, because we never made mistakes, right? ;) )  One of the greatest tools I found was the Common Table Expression.  I love CTEs!  They were great for hunting down subsets of data, especially when I had to aggregate data somehow.  Once I had the data I needed, I could write updates or deletes directly against it.

For example, let’s pretend that person data was imported into a table.  Someone realized after the import was complete that the same phone numbers were imported for multiple people, so those phone numbers need updating.  They don’t know which phone numbers are duplicated for which people, so they need a list of all the people in the table who share a phone number.  The list needs FirstName, LastName and PhoneNumber.  (To keep it simple, everyone has one phone number; pretend it’s 1980.)  Because adding the first and last name to the query would create unique values (since you have to group by everything in the select list), only the phone number can be used at first.  This is a simple query, and I think one of the first SQL queries I ever learned.

SELECT PhoneNumber, COUNT(*) 
FROM Person 
GROUP BY PhoneNumber 
HAVING COUNT(*) > 1

With this query we have a list of all the phone numbers that appear in the table more than once, but we don’t have the FirstName or LastName.  There are a number of ways to get them.  Once I started using CTEs they became my preferred method, because it’s so simple to put the query in parentheses and add the ‘with’ statement.  Here is the resulting query that returns all of the data requested.  You will see the original query is just wrapped into the CTE, with a new query added to get the additional information.

;WITH cte_Duplicates
	 AS ( SELECT PhoneNumber
				 , Count(*) [Count]
		  FROM   Person
		  GROUP  BY PhoneNumber
		  HAVING Count(*) > 1 )
SELECT FirstName
	   , LastName
	   , PhoneNumber
FROM   Person xPer
JOIN   cte_Duplicates xDupe ON xPer.PhoneNumber = xDupe.PhoneNumber

A few notes about the CTE and the syntax, if you are not familiar with the SQL Server documentation on CTEs.  The semi-colon before the ‘with’ is required if any other SQL statement runs before the CTE; I found it’s just easier to always put it in.  The documentation on CTEs shows an optional column list after the name of the CTE, e.g. ;with cte_Duplicates (PhoneNumber, Count).  The requirement on column names is that they are not duplicated within the CTE and that every column has a name.  If you have duplicates you can either alias a column within the SELECT statement or put in a column name list.  The same goes for columns with no names, like the COUNT in the query.  You will notice that in the CTE I put in an alias of [Count] so it has a name, and I put the name in brackets since it’s a reserved word.  I could have left the count out of the CTE, since the final query doesn’t display it, but it’s nice to have in there if you want to highlight just the select statement within the CTE and run it to look at the original data.
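
For illustration, here is the same CTE written with the optional column list instead of an inline alias (either form returns the same rows):

;WITH cte_Duplicates (PhoneNumber, DupeCount)
	 AS ( SELECT PhoneNumber
				 , Count(*)
		  FROM   Person
		  GROUP  BY PhoneNumber
		  HAVING Count(*) > 1 )
SELECT PhoneNumber
	   , DupeCount
FROM   cte_Duplicates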

That list went to the person lucky enough to have to sort out which phone numbers were right and wrong.  They gave us back a list of what the phone numbers should be.  Assuming the list was worked in Excel, I would convert it to a CSV.  I had a macro written in Notepad++ that would wrap every line in parentheses and put a comma on the end.  Then I could put that data into an insert for a table variable or a temp table, depending on size.  Once the data is staged, the person data can be fixed using the same CTE from above.  To do this we’ll just wrap the second query into a second CTE and then write the update.  For this example, assume there is code above the CTEs that declares a table variable @Corrections and inserts the correct data, as sketched below.
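
Something like this (the column sizes and rows are made up, just to make the update below runnable):

DECLARE @Corrections TABLE
	( FirstName   VARCHAR(50)
	, LastName    VARCHAR(50)
	, PhoneNumber VARCHAR(20) )

INSERT INTO @Corrections (FirstName, LastName, PhoneNumber)
VALUES ('Jane', 'Doe', '303-555-0142')   -- hypothetical corrected rows
	 , ('John', 'Smith', '303-555-0199')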

;WITH cte_Duplicates
	 AS ( SELECT PhoneNumber
				 , Count(*) [Count]
		  FROM   Person
		  GROUP  BY PhoneNumber
		  HAVING Count(*) > 1 ),
	 cte_List
	 AS ( SELECT FirstName
				 , LastName
				 , PhoneNumber
		  FROM   Person xPer
		  JOIN   cte_Duplicates xDupe ON xPer.PhoneNumber = xDupe.PhoneNumber )
UPDATE cte_List
SET    PhoneNumber = xNew.PhoneNumber
FROM   cte_List xList
	   JOIN @Corrections xNew ON xList.FirstName = xNew.FirstName

This statement updates the underlying table (Person) with the correct phone number from the @Corrections table variable.  The second select got wrapped into a CTE, which is as easy as putting in a comma, adding a new alias and adding the parentheses.  Then I wrote the update against the second CTE.  (The first CTE cannot be updated because it aggregates data.)

There are some caveats to using CTEs with DML, as with everything in SQL Server.  You can only delete if there is one table in the select statement of the CTE, which limits the capabilities a little.  Hopefully you are updating data more than you are deleting it, because deleting data is never good.  You also cannot update multiple tables at one time: if your CTE has multiple tables in the select, you can only write updates back to columns from one of the tables.  For example, let’s pretend that a group of orders was entered without accounting for a leap year, so the order dates need updating.  The first thing we do is write the select to get the data.  Instead of re-writing everything for the update, just wrap the select into a CTE and then write the update.  It only updates data from the Sales.Orders table, so it will run successfully.

;WITH cte_Updates
	 AS ( SELECT xPer.FirstName + ' ' + xPer.LastName AS Name
				 , xOrd.OrderID
				 , xOrd.OrderDate
		  FROM   Person.Person xPer
		  JOIN   Sales.Orders xOrd ON xPer.PersonID = xOrd.PersonID
		  WHERE  xOrd.OrderDate >= '2/28/2012' )
UPDATE cte_Updates
SET    OrderDate = Dateadd(dd, 1, OrderDate)
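
And for the delete caveat mentioned above, a quick sketch of the single-table form: this removes all but one row per duplicated phone number, assuming the Person table has a PersonID column to order by.

;WITH cte_Dupes
	 AS ( SELECT PhoneNumber
				 , Row_Number() OVER (PARTITION BY PhoneNumber ORDER BY PersonID) AS RowNum
		  FROM   Person )
DELETE cte_Dupes
WHERE  RowNum > 1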

These are really simple examples.  In the real world there would be a PersonID of some kind that could be used for a much simpler update in the first example.  (I would hope.)  The CTE method really shines when you have multiple queries to get to the data that needs to be updated or deleted.  The single most compelling reason for updating or deleting via a CTE, in my opinion, is this:

Updating or deleting data directly through the CTE gives you peace of mind, because you can run the selects to see exactly what data you are inflicting change upon.  There is no guesswork about what data you will touch; if there is, just re-run the select statements in the CTE to view the data.

That is the key.  Instead of writing select statements to find the data and then trying to translate them into an update, you can wrap the select statements up in CTE(s) and update the CTE.  SO SIMPLE!

By |January 17th, 2014|Uncategorized|

PASS SQLSaturday Events: February 2014

Date Event
February 8, 2014 SQLSaturday #241 – Cleveland 2014
February 22, 2014 SQLSaturday #273 – Tampa 2014

View future events on the SQLSaturday website.

By |January 15th, 2014|PASS, SQLSaturday|