more sql performance tips

As most of you know the general rule in performance tuning is …..

“measure, measure, measure” and “measure again”

But it doesn’t hurt to give out some warning signs from time to time. And here goes two: [more]

1) be careful in doing anything fancy on JOIN predicates (that is the condition in the “ON” section)

   most of the time anything more complex than comparing one columns from each of the tables being joined will have some performance impact.

   again measure, but in experience I find that even creating a common table expression, or a subquery to come up with the derived column and use it to join is faster than doing it in the JOIN predicate.

   also, the less columns involved in a JOIN the better.

2) always prefer set operations than row-by-row operations

    for the same reason that cursors should be avoided

    what many don’t know however is that if you have a UDF (user-defined function) used in a SELECT query result set then that function will be called for every row and thus there will be an overheed. Again measure but for performance sensitive cases it will be significant.

     so if you have to squeeze out any performance gain you can have then get rid of the function if you and although redundant have the logic on the query itself.

     maintainability wise this is not advisable. so you have to feel the trade-off here

     For example:

     SELECT fullname = dbo.GetFullName(FirstName, LastName) FROM Employee (NOLOCK)

     regardless of the complexity of the GetFullName function’s implementation would be slower than having it handled in the query directly. say

     SELECT fullname = FirstName + ‘ ‘ + LastName FROM Employee (NOLOCK)

Hope to share more soon. But again, don’t forget, measure3x

SQL Business Intelligence Developer Needed (Manila, Philippines)

We are currently looking for a SQL BI Developer Professional to work with us on exciting, high-profile/scale projects. Feel free to contact me or visit http://www.lwsmedia.com/contact.htm. Looking forward to work with you. [more]

Company Profile:

Agilone LLC
Norwalk, CT USA, Los Gatos CA USA , Istanbul, Turkey  and Manila, Philippines

Companies who can effectively understand, process and take value from their data gain a sizable competitive advantage.  However, many fail to do so since the amount of data captured by organizations is growing more quickly than the capabilities of the tools to analyze it.  Agilone solves this problem with proprietary, SAAS-based analytical tools that help them make data-driven marketing decisions that drive superior results.
 
Agilone’s goal is to help clients develop and execute data-driven marketing strategies. We provide clients with advanced technology and analytical marketing services to help identify and execute opportunities hidden in their data.  Our approach is effective because we analyze more of their data than the competition and offer a customized solution.
 
Typical engagements are in areas of customer valuation, pricing, response modeling, segmentation, with follow-on implementation of technologies such as data warehousing, web-based application development, database management and business intelligence services.
 
We are headquartered in Norwalk, CT with a technology development and services center in Istanbul, Turkey.  Recently, we have opened our new office in Silicon Valley (Los Gatos) California.

We are a high growth, entrepreneurial company and are always looking for intelligent hardworking people to join our company.  Transfers between Istanbul office and US offices are possible and we do sponsor work visas and Greencards in US depending on tenure and performance of employees.
 
Job Description:
This position is for our Manila Office.   Due to the increase in our business volume we need additional SQL developers with the following background and skill set:
 
Position Responsibilities:
– Be part of the developer team from Analysis to design, programming, testing and deployment
– Design and administer project related databases
– Create technical documentation
– Perform unit test of the codes
– Author user manuals and installation guides
 
Education:
Minimum: BS, BA or equivalent, very good command of written and spoken English
 
Required Skills and Experience:
1. Excellent communication skills both written and verbal
2. Competent in T-SQL, working knowledge of MS SQL 2005 and MS SQL 2008.
3. Experience in Dimensional Databases and OLAP Cubes.
4. Desire to learn new platforms and environments as the projects leads.
5. Strong analytical and problem solving skills with attention to detail.
6. Self-motivated – comfortable working in a fast paced environment with limited direction.
7. Ability to multi-task and work on for several different projects.
8. Holding a Microsoft certification is a plus.
 
Process:
We will conduct 2 tests before an interview.  The first test would be a general aptitude test, after you pass this test you will be given a more specific T-SQL test.  After these tests you will be invited for an interview with the Director of Business Intelligence and the Principal of the company.

24 hours of high quality sql server training for free

The Professional Association for SQL Server or popularly known aka PASS, one, if not the most popular and biggest group of SQL Server professional in world will be having 24 hours of PASS – a series of online web meetings with the best speakers in the industry [more]

Join us online on September 2, 2009* for 24 hours of free, nonstop SQL Server training

Join us for 24 Hours of PASS, a FREE community event featuring some
of the top SQL Server and Business Intelligence (BI) speakers in the
industry. Over the course of one 24-hour period—directly from your
computer—our experts will:

  • Show you how to take better advantage of the SQL Server capabilities already at your fingertips
  • Give you an in-depth look at the hottest SQL Server and BI topics,
    including SQLCLR performance, relational database design, performance
    tuning and troubleshooting, SQL Server 2008 security, embedding
    Reporting Services in your apps, SSIS tips and tricks, working with
    spatial data, text mining, data warehousing foundations, and much
    more… 
  • Take your questions directly and answer them live!

Check out all the great SQL Server sessions you can attend.

24 Hours of PASS is a series of live one-hour webcasts broadcast
over a 24-hour period that focus on a host of SQL Server topics. The 24
one-hour presentations will begin at 00:00 GMT (UTC) on September 2,
2009.*

You can view all details from the event's official site : http://24hours.sqlpass.org/

If only I could attend all of them in 24 hrs I would 🙂 Hopefully they would be able to make recordings this available. I will be trying to attend the Business Intelligence related ones so if we do make it, see you there. "virtually"

TSQL to get SQL Server properties

TSQL script to get common SQL Server properties (edition, instancename et al) [more]

SELECT
  SERVERPROPERTY('Collation') Collation,
  SERVERPROPERTY('Edition') Edition,
  SERVERPROPERTY('Engine Edition') EngineEdition,
  SERVERPROPERTY('InstanceName') InstanceName,
  SERVERPROPERTY('IsClustered') IsClustered,
  SERVERPROPERTY('IsFullTextInstalled') IsFullTextInstalled,
  SERVERPROPERTY('IsIntegratedSecurityOnly') IsIntegratedSecurityOnly,
  SERVERPROPERTY('IsSingleUser') IsSingleUser,
  SERVERPROPERTY('IsSyncWithBackup') IsSyncWithBackup,
  SERVERPROPERTY('LicenseType') LicenseType,
  SERVERPROPERTY('MachineName') MachineName,
  SERVERPROPERTY('NumLicenses') NumLicenses,
  SERVERPROPERTY('ProcessID') ProcessID,
  SERVERPROPERTY('ProductVersion') ProductVersion,
  SERVERPROPERTY('ProductLevel') ProductLevel,
  SERVERPROPERTY('ServerName') ServerName

Taken from: http://sqlserver-qa.net/blogs/t-sql/archive/2007/12/21/2994.aspx

caution in dropping a temp table before creating it

Recently I ran into a script instead a stored procedure

IF OBJECT_ID(tempdb..#temp1) DROP TABLE #temp1

Basically, the object of this script is to check if #temp1 (regular temporary table) exists. If so drop it.
However, I think it can have unintended consequences and maybe safer not to include. [more]

Say you have a script that includes the call to the stored procedure (eg. SampleStoredProc)
If the script (let’s call this “caller”) creates a table #temp1 and at the top of SampleStoredProc you have if object_id(tempdb..#temp1) drop table #temp1, what will happen is that the #temp1 table of the “caller” will be dropped.
And the caller might not want that (or won’t expect that the #temp1 table he/she created will be dropped). It is possible that after calling SampleStoredProc the caller would still want to use/access #temp1.
On the other hand if no drop table #temp1 is executed inside SampleStoredProc and a create table #temp1 is made even if the “caller” has a #temp1 already it will not be a problem. The #temp1 of the caller and #temp1 of BehavClusDOM1 will be separately identified.
Since BehavClusDOM1 is a stored procedure it is a scope for temp tables and safe to assume that at the start of the stored procedure no temp tables are present in that scope.
Basically the idea is that BehavClus should not touch whatever is beyond its scope.

So in my opinion the inclusion of this code can cause unexpected behavior to the caller while removing it poses no risk not to mention shortening the code and decreasing complexity and readability. The author of the stored procedure (eg. SampleStoredProc) involving creation of #temp1 should know when it is present and shouldn't worry about clashing with another #temp1 in another session. You can explicitly DROP TABLE #temp1 if you want but only after you have created your own #temp1 so your sure that you'd be dropping the one you created and not that in other scripts.

temp table (#), global temp tables (##) and @table variables

I've been working "full-time" on TSQL scripts for the past month (no with .NET windows/web apps) and mostly on optimization. And I feel that I should share with everyone this article about temp tables and table variables and some of my own notes. Go read the article below then you may come back here. Take careful note of the Conclusion part at the end of the article. [more]

Should I use a #temp table or a @table variable?

Things to generally consider are speed, disk space and cpu utilization, persistence. And here are some short hopefully helpful notes.

1) persistence – if you need the data to persist even after the execution then no doubt you need to use permanent tables.

2) cpu utilization – always review indexes. effect of the use of regular temp, global temp tables, table variables to performance is not as significant as a missing index. Among other things, it is safe to say, always ensure you have a primary key. and also if you will perform queries with ORDER BY then always consider your clustered indexes are correct. you can use estimated or actual execution plans to analyze your queries. I recommend SQL profiler too but correct me if I'm wrong but they are only able to profile permanent tables.

3) speed – first thing, indexes again too (see above). Secondly and very important – Although table variables may seem (and actually common) faster than temp tables, my observation is that if you are dealing with large datasets then temp tables are way way faster than table variables. I could not quantify giving the dataset I'm working on but suffice to say that it took quite a number of folds faster that I cancelled the execution.

4) disk space – temp tables, global temp tables and table variables take up the same space as permanent tables would. But they should be cleared once the procedure/function goes out of scope. The tempdb transaction log is less impacted than with #temp tables;
table variable log activity is truncated immediately, while #temp table
log activity persists until the log hits a checkpoint, is manually
truncated, or when the server restarts

Convert Delimited Values to a table with Ordinal Column

This is a script to conver a delimited/separated values given a delimiter into a table, via a User Defined Function.[more]

This has saved me a lot of work already including just now so I gfigured this is a good share

My apologies as I could not determine where I based this script from but definitely I got it from someone else rather than created my own. Had a few revisions along the way.

CREATE FUNCTION [dbo].[Split]
(
  @String VARCHAR(200),
  @Delimiter VARCHAR(5)
)
RETURNS @SplittedValues TABLE
(
  Ordinal SMALLINT IDENTITY(1,1) PRIMARY KEY,
  SplitValue VARCHAR(MAX)
)
AS
BEGIN
  DECLARE @SplitLength INT
  WHILE LEN(@String) > 0
  BEGIN
    SELECT @SplitLength =
      (CASE CHARINDEX(@Delimiter,@String)
        WHEN 0 THEN LEN(@String)
        ELSE CHARINDEX(@Delimiter,@String) -1
      END)
    
    INSERT INTO @SplittedValues
      SELECT SUBSTRING(@String,1,@SplitLength)
      SELECT @String =
        (CASE (LEN(@String) – @SplitLength)
          WHEN 0 THEN ''
          ELSE RIGHT(@String, LEN(@String) – @SplitLength – 1)
        END)
  END
  RETURN
END
GO

Performance, Measure and ANTS Profiler

Might need to create a separate page for notes on performance since I've been doing a lot of C# and database tuning lately but having them on this post so far. Here are some of my notes on performance [more]

* Before you optimize, ensure that your results are accurate first before optimizing. I would suggest Test Driven Development or at least some unit testing but that's another story. Just make sure your results are correct first otherwise your optimization is useless.

* Before you optimize, always MEASURE MEASURE MEASURE

* If you can measure without introducing extra code, go with it. Aside from saving time, it will keep your code clean and introduce what could be unnecessary complexity. And for this very reason I just purchase ANTS Profiler 4 from RedGate software. Another alternative is dotNetTrace from JetBrains and although I love Resharper from the same company I prefer ANTS profiler. I said prefer because as I haven't explored dotNetTrace that much, i would say it is really preference, UI/usability and being able to view the timings embedded in a window which shows the source code. Also like the call graph, drill down on significant methods in terms of time spent on it. I got a quote for a no support/update version of the software when I emailed RedGate about their pricing which is admittedly very high for a personal purchase when you are earning from a 3rd world country. So will be receiving fixes but not major upgrades (so as they say but haven't dug deep – besides the version looks pretty good and turns out to be very useful for me already). It really sucks to try to optimize something only to find out that it is not the bottleneck

* If your application involves database access and you're slow then more often than not your database is not tuned. And SQL Profiler and Database Tuning Advisor is a very good start. You can apply the recommendations or you can just evaluate them and make your own adjustments. In my experience however the recommended changes does make a lot of performance improvement. You think you know enough about indexes and database design? You might be surprised how much performance you can gain from these tools.

Again measure, measure, measure but if you can only identify some specific parts then data access seems like a good start.

* Then look into you application logic. Even if your code is optimized but your algorithm/logic is wrong, it will still turn out bad

* Also know about database/table statistics, indexed views and partition. If you can take advantage of partitions and involving considerable data access the improvement is quite impressive.

* <string>.SubString(…) does some considerable lifting so if you want to check if the first two chars in a string is equal to some other string then you should consider using <string>.StartsWith instead or avoid the SubString if you can

* Hashtable is faster than SortedDictionary, SortedList or List. I know I should provide stats for this (have a URL somewhere and will update this soon)

* When using nullable types, use <variable>.HasValue as much as possible than comparing using != null. Or even "!<var>.HasValue" vs. "== null". But avoid negation if you have to

Finally this list will grow soon hopefully and if feel like you disagree please feel free to comment and have other benefit from your thoughts too 🙂