Jasper22.NET: Optimization and generics, part 1: the new() constraint

Optimization and generics, part 1: the new() constraint

Posted by jasper22 at 17:07 | Tuesday, August 23, 2011

As with almost any performance work, your mileage may vary (in particular the 64-bit JIT may work differently) and you almost certainly shouldn't care. Relatively few people write production code which is worth micro-optimizing. Please don't take this post as an invitation to make code more complicated for the sake of irrelevant and possibly mythical performance changes.

I've been doing quite a bit of work on Noda Time recently - and have started getting my head round all the work that James Keesey has put into the parsing/formatting. I've been reworking it so that we can do everything without throwing any exceptions, and also to work on the idea of parsing a pattern once and building a sequence of actions for both formatting and parsing from the action. To format or parse a value, we then just need to apply the actions in turn.
Simples.

Given that this is all in the name of performance (and I consider Noda Time to be worth optimizing pretty hard) I was pretty cross when I ran a complete revamp through the little benchmarking tool we use, and found that my rework had made everything much slower. Even parsing a value after parsing the pattern was slower than parsing both the value and the pattern together. Something was clearly very wrong.

In fact, it turns out that at least two things were very wrong. The first (the subject of this post) was easy to fix and actually made the code a little more flexible. The second (the subject of the next post, which may be tomorrow) is going to be harder to work around.
The new() constraint

In my SteppedPattern type, I have a generic type parameter - TBucket. It's already constrained in terms of another type parameter, but that's irrelevant as far as I'm aware. (After today though, I'm taking very little for granted...) The important thing is that before I try to parse a value, I want to create a new bucket. The idea is that bits of information end up in the bucket as they're being parsed, and at the very end we put everything together. So each parse operation requires a new bucket. How can we create one in a nice generic way?

Well, we can just call its public parameterless constructor. I don't mind the types involved having such a constructor, so all we need to do is add the new() constraint, and then we can call new TBucket():

// Somewhat simplified...
internal sealed class SteppedPattern<TResult, TBucket> : IParsePattern<TResult>
    where TBucket : new()
{
    public ParseResult<TResult> Parse(string value)
    {
        TBucket bucket = new TBucket();

// Rest of parsing goes here
}
}

Great! Nice and simple. Unfortunately, it turned out that that one line of code was taking 75% of the time to parse a value. Just creating an empty bucket -
pretty much the simplest bit of parsing. I was amazed when I discovered that.

Fixing it with a provider

The fix is reasonably easy. We just need to tell the type how to create an instance, and we can do that with a delegate:

// Somewhat simplified...
internal sealed class SteppedPattern<TResult, TBucket> : IParsePattern<TResult>
{
private readonly Func<TBucket> bucketProvider;

Read more: Jon Skeet: Coding Blog
QR: optimization-and-generics-part-1-the-new-constraint.aspx