Archive

Archive for the ‘Tuning Queries – Optimize’ Category

Sometimes temp tables can be your best friend

Link:http://sqlserverpedia.com/blog/sql-server-bloggers/sometimes-temp-tables-can-be-your-best-friend/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+sqlserverpedia+%28SQLServerPedia%29

Here is a quick example of when using temporary tables will give you a big performance gain. I was recently given this query to improve, as it was taking over 15 hours to run:

SELECT *
FROM
(
    SELECT *
    FROM GL_Star_FactsDetails GL
    LEFT JOIN DimAccount ACCT ON GL.Account = ACCT.AccountRapidDecision
    WHERE GL.LedgerTypeCode = 'AA'
) FD
INNER JOIN vwDimBusinessUnit BU ON FD.BusinessUnitCode = BU.BusinessUnitCode
INNER JOIN DimCurrencyExchangeRates EX ON BU.CurrencyCode = EX.CurrencyCode AND FD.DateKey = EX.DateKey

GL_Star_FactsDetails has 16 million records, DimAccount has 5,000 records, DimCurrencyExchangeRates has 70,000 records, and vwDimBusinessUnit is a complex view (it has a UNION ALL and multiple joins) that returns 10,000 records.

These are not big tables, so why is this query taking so long?  A simple look at the query execution plan tells the story: it has 80 operations!  It is simply a matter of the query optimizer having too many tables and too many joins to come up with an efficient query plan.  So, use temp tables to simplify it for the query optimizer:

-- substitute a temp table for the view
SELECT *
INTO #TempDimBusinessUnit
FROM vwDimBusinessUnit

-- substitute a temp table for the sub-query
SELECT *
INTO #TempFactDetails
FROM GL_Star_FactsDetails GL
LEFT JOIN DimAccount ACCT ON GL.Account = ACCT.AccountRapidDecision
WHERE GL.LedgerTypeCode = 'AA'

-- and use the two temp tables in the main query
SELECT *
INTO #temp
FROM #TempFactDetails FD
INNER JOIN #TempDimBusinessUnit BU ON FD.BusinessUnitCode = BU.BusinessUnitCode
INNER JOIN DimCurrencyExchangeRates EX ON BU.CurrencyCode = EX.CurrencyCode AND FD.DateKey = EX.DateKey

SELECT * FROM #temp

This rewrite resulted in a much simpler query plan, and the query now takes only 12 minutes instead of over 15 hours. While there are some negatives to using temp tables (the intermediate data is written to tempdb, which adds I/O), in this case the benefits make them very worthwhile.
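One refinement worth trying on top of this approach (not in the original post; the column names are taken from the query above) is to index the temp tables on their join columns right after the SELECT ... INTO statements, so the final join can seek instead of scan, and to drop the tables explicitly when done:

-- illustrative indexes on the join columns; adjust to your data
CREATE CLUSTERED INDEX IX_TempFD ON #TempFactDetails (BusinessUnitCode, DateKey)
CREATE CLUSTERED INDEX IX_TempBU ON #TempDimBusinessUnit (BusinessUnitCode)

-- temp tables vanish when the session ends, but dropping them
-- explicitly releases tempdb space sooner
DROP TABLE #TempFactDetails
DROP TABLE #TempDimBusinessUnit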

Performance Advantages of SQL Server Filtered Statistics


Problem

When running queries, SQL Server may not always use the best statistics to determine the optimal query plan.  This can be a performance hit for complex queries, and having accurate statistics is key for the optimizer.  In this tip we look at how filtered statistics can improve query optimization.

Solution

One of the new features introduced with SQL Server 2008 is Filtered Statistics. This is the definition from Books Online (BOL): “Filtered statistics can improve query performance for queries that select from well-defined subsets of data. Filtered statistics use a filter predicate in the WHERE clause to select the subset of data that is included in the statistics.”

Let us show how to use filtered statistics with an example.  The code below creates two tables and inserts some data. The code for this tip was adapted, with permission, from an article by Jack Li.

CREATE TABLE MyRegionTable(id INT, Location NVARCHAR(100), USState CHAR(2))
GO
CREATE TABLE MySalesTable(id INT, detail INT, quantity INT)
GO
CREATE CLUSTERED INDEX IDX_d1 ON MyRegionTable(id)
GO
CREATE INDEX IDX_MyRegionTable_name ON MyRegionTable(Location)
GO
CREATE STATISTICS IDX_MyRegionTable_id_name ON MyRegionTable(id, Location)
GO
CREATE CLUSTERED INDEX IDX_MySalesTable_id_detail ON MySalesTable(id, detail)
GO
-- MyRegionTable has one row per location
INSERT MyRegionTable VALUES(0, 'Atlanta', 'GA')
INSERT MyRegionTable VALUES(1, 'San Francisco', 'CA')
GO
SET NOCOUNT ON
-- MySalesTable will contain 1 row for Atlanta and 1000 rows for San Francisco
INSERT MySalesTable VALUES(0, 0, 50)
DECLARE @i INT
SET @i = 1
WHILE @i <= 1000 BEGIN
    INSERT MySalesTable VALUES (1, @i, @i * 3)
    SET @i = @i + 1
END
GO
UPDATE STATISTICS MyRegionTable WITH fullscan
UPDATE STATISTICS MySalesTable WITH fullscan
GO

Before running the queries below, turn on “Include Actual Execution Plan” in SSMS.
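If you are running this from a script rather than SSMS, one alternative (standard T-SQL, not mentioned in the original tip) is to have the actual plan returned as XML alongside the results:

-- returns the actual execution plan as an XML document
-- with each query's result set
SET STATISTICS XML ON

Now run the following query for Atlanta.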

-- this query will over estimate
-- its estimate will be 500.5 rows, but should be 1 row
SELECT detail FROM MyRegionTable JOIN MySalesTable ON MyRegionTable.id = MySalesTable.id
WHERE Location='Atlanta' OPTION (recompile)

Even though there is only 1 row for the Atlanta location, the optimizer overestimates and uses a value of 500.5, as the execution plan shows.

Now, let’s run the following query for San Francisco.

-- this query will under estimate
-- its estimate is also 500.5 rows, but should be 1000 rows
SELECT detail FROM MyRegionTable JOIN MySalesTable ON MyRegionTable.id = MySalesTable.id
WHERE Location='San Francisco' OPTION (recompile)

In this case, the optimizer underestimates the number of rows. It is interesting to note that we have only two location values and 1,001 rows in total, and in both queries the optimizer’s estimate is 500.5 rows. With no statistics correlating Location to id, the optimizer falls back on the average number of rows per id value: 1,001 / 2 = 500.5.
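You can verify this density yourself with DBCC SHOW_STATISTICS (a standard command; the index name is the one created above):

-- "All density" for the leading column (id) is 1 / 2 distinct values = 0.5;
-- 1,001 rows * 0.5 gives the 500.5 estimate seen in both plans
DBCC SHOW_STATISTICS ('MySalesTable', IDX_MySalesTable_id_detail) WITH DENSITY_VECTOR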

Let’s create our filtered statistics using the code below.

CREATE STATISTICS MyRegionTable_stats_id ON MyRegionTable (id)
WHERE Location = 'Atlanta'
GO
CREATE STATISTICS MyRegionTable_stats_id2 ON MyRegionTable (id)
WHERE Location = 'San Francisco'
GO

The filtered statistics have now been created.
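If you prefer to confirm this with T-SQL instead of a screenshot, the catalog view sys.stats (available since SQL Server 2008) exposes the filter predicate of each statistics object:

-- list the filtered statistics on MyRegionTable with their predicates
SELECT name, has_filter, filter_definition
FROM sys.stats
WHERE object_id = OBJECT_ID('MyRegionTable')
AND has_filter = 1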

Let’s run the same queries one more time.

SELECT detail FROM MyRegionTable JOIN MySalesTable ON MyRegionTable.id = MySalesTable.id
WHERE Location='Atlanta' OPTION (recompile)

We can see that the estimated number of rows is now correct: 1 row.

Let’s run the same query again for San Francisco.

SELECT detail FROM MyRegionTable JOIN MySalesTable ON MyRegionTable.id = MySalesTable.id
WHERE Location='San Francisco' OPTION (recompile)

The estimate is correct for this query as well: 1,000 rows.
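To clean up the demo objects when you are done (not part of the original tip):

-- drop the filtered statistics individually if you want to re-run the
-- comparison; dropping the tables removes their statistics as well
DROP STATISTICS MyRegionTable.MyRegionTable_stats_id
DROP STATISTICS MyRegionTable.MyRegionTable_stats_id2
GO
DROP TABLE MySalesTable
DROP TABLE MyRegionTable
GO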