Implementing Memoize in Delphi 2009
My last series on Delphi 2009's generics was self-consciously investigating a corner case. This post, however, is going to bring us back to Earth very quickly, so prepare for a steep descent! I'm going to implement a useful, higher-order function, Memoize, using Delphi 2009's generics and anonymous methods.
Memoization is a generic solution to the problem of caching function results. The function Memoize accepts a function as an argument and returns a function which does exactly the same thing, except that it caches the results. Here's the prototype:
So my version of Memoize will accept any function which takes a single argument of any type, and returns a result of any type. Memoize returns a new function with exactly the same signature. I've often written systems to cache the results of database queries in my older Delphi code. Lately, I've been using TClientDataSet to do the same thing, but it still results in several lines of code to accomplish what is essentially a single function call. Memoize allows me to solve this problem one time and use it for as many situations as I can imagine.
Well, almost any function. Caching is obviously only useful for functions which are referentially transparent, which is another way of saying that their output is completely determined by their input. You wouldn't want a cached version of "Now()" or "Create()", for example. Memoization is most useful when a function performs a lengthy calculation, or queries a database or a web service with a lot of latency. In order to simulate such a function for the purpose of testing my implementation of Memoize, I've written the following test function:
This is useless, but it allows me to distribute a testable version of the project without having any dependency on any specific database or Web service. Feel free to substitute a more useful function if you try this yourself. :) In order to create a memoized version of this function, I just pass it to Memoize, and call the function that is returned instead of the original one:
The Memoize function certainly sounds useful, but how could I write such a thing? Well, I obviously need generics, or my version of Memoize would be limited to functions which only accepted and returned one specific type. And I need anonymous methods, because that is the only way that I can return new functions at runtime. Finally, I need some kind of cache to store pairs of arguments and results, so that successive executions with the same argument can find result in the cache.
Let's examine the last requirement first. Delphi 2009's Generics.Collections unit includes a TDictionary<TKey,TValue> type which, at first glance, it appears to do what I need. Unfortunately, it isn't lifetime-managed. So I will just subtype Delphi 2009's TDictionary<TKey,TValue> and implement a new IDictionary<TKey,TValue> interface I will create.
I will explain why it is important to have a lifetime-managed dictionary in order to implement this function at the end of this post. You can easily understand the rest of the post without grasping this detail, if you're willing to take my word on the fact that I need a lifetime-managed list. For the time being, let's just continue with the code. Here's the full source code for Memoize, which is actually very simple:
In Delphi 2009, you can call Exit with an argument in order to return a certain value from a function immediately, as if you had assigned Result and then called Exit without the argument. Other than that, I don't think there is too much to explain about the implementation above.
In order to prove that it works, I wrote a unit test. The unit test shows that the second execution of a Memoized version of a slow function for the same argument (1) returns the same value and (2) is more than two orders of magnitude faster. One of the reasons I included this unit test was to demonstrate that unit tests can be used to verify performance goals as well as functional specifications. In order to profile the function, I've "borrowed" Barry Kelly's useful TBenchmarker class. Here's the test:
You can download the full source code from CodeCentral.
Finally, here's the full explanation for why I needed to create a lifetime-managed dictionary:
Anonymous methods, in every implementation I've ever seen, can "capture" variables in the declaring method. This means that you can use variables (or method arguments) declared in the method which defines the anonymous method, without passing them as arguments. But since we are going to return the anonymous function as the result of the defining function, any object referenced in the anonymous function must not be freed by the defining function. In Delphi 2009, captured variables are captured by reference (this is similar to C#, but different than JavaScript). Recall that anonymous methods are reference counted. When an anonymous method captures a variable in the declaring method, it will be freed when there are no more references to the anonymous method, provided that the captured variable is a lifetime-managed type, such as a string, a record, an interface, etc.
Memoization is a generic solution to the problem of caching function results. The function Memoize accepts a function as an argument and returns a function which does exactly the same thing, except that it caches the results. Here's the prototype:
class function Memoize<A, R>(AFunc: TFunc<A, R>): TFunc<A, R>;
So my version of Memoize will accept any function which takes a single argument of any type, and returns a result of any type. Memoize returns a new function with exactly the same signature. I've often written systems to cache the results of database queries in my older Delphi code. Lately, I've been using TClientDataSet to do the same thing, but it still results in several lines of code to accomplish what is essentially a single function call. Memoize allows me to solve this problem one time and use it for as many situations as I can imagine.
Well, almost any function. Caching is obviously only useful for functions which are referentially transparent, which is another way of saying that their output is completely determined by their input. You wouldn't want a cached version of "Now()" or "Create()", for example. Memoization is most useful when a function performs a lengthy calculation, or queries a database or a web service with a lot of latency. In order to simulate such a function for the purpose of testing my implementation of Memoize, I've written the following test function:
class function TestTMemoize.SlowIncrement(ANum: integer): integer;
begin
Sleep(100); // simulate web service / db access
Result := ANum + 1;
end;
This is useless, but it allows me to distribute a testable version of the project without having any dependency on any specific database or Web service. Feel free to substitute a more useful function if you try this yourself. :) In order to create a memoized version of this function, I just pass it to Memoize, and call the function that is returned instead of the original one:
procedure DoStuff;
var
MemoizedFunc: TFunc<integer,integer>;
iResult: integer;
begin
MemoizedFunc := TMemoize.Memoize<integer,integer>(SlowIncrement);
iResult := MemoizedFunc(1);
end;
The Memoize function certainly sounds useful, but how could I write such a thing? Well, I obviously need generics, or my version of Memoize would be limited to functions which only accepted and returned one specific type. And I need anonymous methods, because that is the only way that I can return new functions at runtime. Finally, I need some kind of cache to store pairs of arguments and results, so that successive executions with the same argument can find result in the cache.
Let's examine the last requirement first. Delphi 2009's Generics.Collections unit includes a TDictionary<TKey,TValue> type which, at first glance, it appears to do what I need. Unfortunately, it isn't lifetime-managed. So I will just subtype Delphi 2009's TDictionary<TKey,TValue> and implement a new IDictionary<TKey,TValue> interface I will create.
I will explain why it is important to have a lifetime-managed dictionary in order to implement this function at the end of this post. You can easily understand the rest of the post without grasping this detail, if you're willing to take my word on the fact that I need a lifetime-managed list. For the time being, let's just continue with the code. Here's the full source code for Memoize, which is actually very simple:
class function TMemoize.Memoize<A,R>(AFunc: TFunc<A, R>): TFunc<A, R>;
var
Map: IDictionary<A, R>;
begin
Map := TManagedDictionary<A, R>.Create;
Result := function(arg: A): R
var
FuncResult: R;
begin
if Map.TryGetValue(arg, FuncResult) then begin
Exit(FuncResult);
end;
FuncResult := AFunc(arg);
Map.Add(arg, FuncResult);
Exit(FuncResult);
end;
end;
In Delphi 2009, you can call Exit with an argument in order to return a certain value from a function immediately, as if you had assigned Result and then called Exit without the argument. Other than that, I don't think there is too much to explain about the implementation above.
In order to prove that it works, I wrote a unit test. The unit test shows that the second execution of a Memoized version of a slow function for the same argument (1) returns the same value and (2) is more than two orders of magnitude faster. One of the reasons I included this unit test was to demonstrate that unit tests can be used to verify performance goals as well as functional specifications. In order to profile the function, I've "borrowed" Barry Kelly's useful TBenchmarker class. Here's the test:
procedure TestTMemoize.TestMemoize;
var
Cold, Warm: Double;
ColdResult, WarmResult, AnonResult: integer;
MemoizedSlowInc: TFunc<integer, integer>;
BenchmarkProc: TProc;
begin
MemoizedSlowInc := TMemoize.Memoize<integer,integer>(SlowIncrement);
BenchmarkProc := procedure()
begin
AnonResult := MemoizedSlowInc(1);
end;
// first execution should be slow
Cold := TBenchmarker.Benchmark(BenchmarkProc, 1, 0);
ColdResult := AnonResult;
// next time for same value should be fast
Warm := TBenchmarker.Benchmark(BenchmarkProc, 1, 0);
WarmResult := AnonResult;
CheckEquals(ColdResult, WarmResult);
CheckTrue(Warm < (Cold / 100));
end;
You can download the full source code from CodeCentral.
Finally, here's the full explanation for why I needed to create a lifetime-managed dictionary:
Anonymous methods, in every implementation I've ever seen, can "capture" variables in the declaring method. This means that you can use variables (or method arguments) declared in the method which defines the anonymous method, without passing them as arguments. But since we are going to return the anonymous function as the result of the defining function, any object referenced in the anonymous function must not be freed by the defining function. In Delphi 2009, captured variables are captured by reference (this is similar to C#, but different than JavaScript). Recall that anonymous methods are reference counted. When an anonymous method captures a variable in the declaring method, it will be freed when there are no more references to the anonymous method, provided that the captured variable is a lifetime-managed type, such as a string, a record, an interface, etc.

This is excellent. Yet another really cool use for anonymous methods! Another minor optimization you can do is to add the "static" directive to your generic class method. That will remove the implicit "Self" parameter to the function since it is clearly not needed. (Self in the case of a class method is a reference to the class *type*). By adding "static" it makes the method merely a scoped global procedure or function.
This will remove the added (I know, it's very, very minor) overhead of the MOV EAX, for the call to TMemoize.Memoize(). Since your goal was a performance boost, might as well grab all you can :-).
Allen.