Skip(0) in LINQ and Testing

A couple of weeks ago, I wrote "In LINQ, Beware of Skip(0)". In that post, I observed that calling Skip(0) on a query in LINQ, which has no effect on the data returned, imposes a performance penalty with at least some LINQ providers. At the time I commented that there might be some desirable reason for this behavior that I had missed. Sure enough, one of the developers on the LINQ to SQL team noted in comments that Skip(0) will cease to be a no-op in LINQ to SQL in .NET 4.0, and supplied a perfectly reasonable explanation for the change.

Given that calling Skip(0) introduces a performance penalty, but that there is a reason for the penalty, should we avoid making this call? The short answer is yes, the performance improvement is worth it. But don't forget the rule that you should have one test case for every conditional in your code. This rule is generally applied to unit testing, but in this case the appropriate test is an integration test. Once you skip the call for the first page, the query that retrieves the second page is an entirely different query from the one that retrieves the first page, and you need to be sure that the LINQ provider will perform acceptably, and not fail altogether, when it runs.
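To make the conditional concrete, here is a minimal sketch of a paging helper that only calls Skip when there is something to skip. The names Paging, Page, pageIndex, and pageSize are hypothetical and are not taken from the earlier post; the point is only to show where the extra branch, and therefore the extra test case, comes from.

```csharp
using System;
using System.Linq;

public static class Paging
{
    // Hypothetical helper: builds the query for one page of an already-ordered query.
    // pageIndex is zero-based, so pageIndex 0 is the first page.
    public static IQueryable<T> Page<T>(IQueryable<T> orderedQuery, int pageIndex, int pageSize)
    {
        if (orderedQuery == null) throw new ArgumentNullException("orderedQuery");
        if (pageIndex < 0) throw new ArgumentOutOfRangeException("pageIndex");
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");

        int rowsToSkip = pageIndex * pageSize;

        // This is the conditional that needs a test case per branch:
        // the first page never calls Skip, every later page does.
        IQueryable<T> query = rowsToSkip > 0
            ? orderedQuery.Skip(rowsToSkip)
            : orderedQuery;

        return query.Take(pageSize);
    }
}
```

Called as Paging.Page(query, 0, 10) this builds a query with no Skip call, while Paging.Page(query, 1, 10) builds one with Skip(10). Running it against an in-memory sequence via AsQueryable() only checks the paging arithmetic; it says nothing about the SQL a real provider generates, which is why both branches still need an integration test against the actual database.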

When I first observed the issue, I was happy to discover that I could make the query which fetches data for the first page of a result set (which happens to be the page which users most commonly request) 10 times faster. Of course, the other way to look at this is that the query that fetches data for the second and all subsequent pages is 10 times slower than the query which fetches data for the first page. Again, given that the first page is displayed much more frequently, that's not necessarily a bad thing, as long as it doesn't catch you by surprise. Indeed, a Skip(n) call with n > 0 that merely returns results more slowly than the same query without Skip is actually the best case; in the worst case, the LINQ provider generates invalid SQL and the query fails.

Note that it is especially difficult to ensure complete integration testing coverage as you introduce more and more ways that the user can shape a result set. If the user can change the sort order, enter a variety of search queries, etc., it's a safe bet that every combination of these options will produce a different query with different performance characteristics. Avoiding calling Skip(0), in other words, does not increase the number of integration test cases for a particular list of data by one; it doubles it.
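The doubling can be sketched by enumerating the test matrix. Everything here is illustrative: the class TestMatrix, the sort and search option names, and the assumption that an unconditional Skip produces one query shape per combination are mine, not measurements from any particular provider.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TestMatrix
{
    // Hypothetical enumeration of integration test cases for one list of data.
    public static IEnumerable<string> Cases(
        string[] sortOrders, string[] searches, bool skipOnlyWhenNeeded)
    {
        // With a conditional Skip, page 0 (no Skip) and page 1 (Skip) are
        // different queries, so each combination must be tested on both.
        // With an unconditional Skip, one page index is assumed to cover the shape.
        int[] pages = skipOnlyWhenNeeded ? new[] { 0, 1 } : new[] { 1 };

        return from sort in sortOrders
               from search in searches
               from page in pages
               select "sort=" + sort + ", search=" + search + ", page=" + page;
    }
}
```

With two sort orders and three kinds of search, this yields 6 cases when Skip is always called and 12 when the first page avoids it, and each further option the user can change multiplies both figures.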

