How to assert that your SQL does not do full table scans

Thursday, March 25, 2010 3:18:00 PM (Romance Standard Time, UTC+01:00)

In database applications it is often important to ensure that your SQL uses indexes to avoid performance problems. Consider an SQL select statement like this

SELECT text FROM response WHERE questionId = 27 AND participantId = 38

If there is no index on response(questionId, participantId), the database needs to do a full table scan, and if the response table have a lot of rows, this will degrade performance. Such a performance problem might only be observed during testing on large amounts of data, which is not typically something a developer will do before every commit.

It would be useful to be able to detect this kind of problem much earlier, preferably during unit testing. Here I will describe one way to do this using Microsoft SQL Server and .NET:

SQL Server makes a lot of information accessible through performance counters, and one of those performance counters is incremented every time SQL Server does a full table scan. Utilizing that information, we can write a unit test that verifies that indexes are properly in place by asserting that the number of full table scans performed by a piece of code is equal to zero:

using(new AssertSqlFullScanCount(0))
{

var
responses = Response.GetResponses(q, p);
Assert.That(responses, ...);
}

This code uses the following helper class:

public class AssertSqlFullScanCount : IDisposable
{
private
int expected;
private long actual;
private PerformanceCounter fullScans;
public AssertSqlFullScanCount(int expected)
{
this.expected = expected;
this.fullScans = new PerformanceCounter(
"MSSQL$SQLEXPRESS:Access Methods",
"Full Scans/sec");
this.actual = fullScans.RawValue;
}

public
void Dispose()
{
actual = fullScans.RawValue - actual;

Assert.That(actual, Is.EqualTo(expected));
}
}

Do you consider this kind of performance unit testing useful? How would you implement this kind of tests with Oracle

By Lars Thorup

Profile your .NET code with NProf

Sunday, August 05, 2007 10:58:04 AM (Romance Standard Time, UTC+01:00)
When writing code I often spot some optimization opportunities, which I typically leave for later, just writing a small comment:
    // ToDo: Optimize this
I believe it is important to first get the behaviour right, and only consider detailed optmizations later.

Then one day the "optimization"-task becomes the top priority on my list, and then I am tempted to start by looking through all my recorded optimization opportunities to try and figure out which one will give me most performance for the least amount work. But then I think twice and instead look around for my favorite profiling tool. Because all the many times I have done this optimization exercise in the past the profiler have always found that the best optimization opportunity was one that I never even considered. So typically I end up never implementing any of the small optimization opportunities I have marked for myself. Maybe I should stop comforting my self by writing them, and just skip them...

I have used several commercial profiling tools in the past, and generally enjoy their ease of use: they can typically profile many different types of applications (web, libraries, command line) and they also typically have a "hot spot"-feature that immediately show you the way to interesting optimization opportunities. On Java I have been using OptimizeIt from Borland, and on the .NET platform I have used dotTrace from JetBrains.

However I recently discovered the open source profiler NProf. And while this tool does not have all the usability features of the commercial alternatives, the core profiling functionality is good enough for its inclusion in an open source tool chain.

NProf can only profile an executable, and I was using it for a web project. But fortunately I have a lot of tests, both unit tests and integration tests. And it didn't take long to write a small executable wrapper around one of my thorough integration tests and then firing up NProf on the resulting executable. And what did I see: that the most important optimization opportunity would be caching of configuration settings, cutting around 33% off the main call sequence of my application, something that had never occured to me...

To site Lord Kelvin: If you can't measure it, you can't improve it.

By Lars Thorup