Building a Generic Statistics Library, Part 4: Map

Posted by on in Blogs
It's been a busy week here in Columbus, what with the highest wind speeds ever recorded in the city and no power in my house — or 1/2 million others — since Sunday.  But don't feel too bad for me; a lot of people have it worse. Anyway, this will be a fairly short post.

This post is the fourth in my series on building a statistics library using Delphi 2009's new generic types.  If you're new to the series, you might want to start with the first post.  Today I'm going to discuss the second higher-order function that the library uses internally, Map.

Map is a function which takes a list, and a function which operates on individual items in the list, and returns a new list, formed by executing the function against each entry in the original list.  For example, suppose I had this list of integers:
1, 2, 3, 4, 5

Now let's say that I call Map, passing this list, and the function Inc(). The result would be this list:
2, 3, 4, 5, 6

Don't confuse Map with functions like jQuery's each which merely iterate the list and call a passed function without returning a new list.  That's a different kind of function.

To calculate the standard deviation of a list, I first need to create a list of the squares of the deviations of each entry in the list. Later, I will perform additional calculations to compute the standard deviation, but first I need this list.  Here's how I do it, using Map:

  SquaredDeviations := Map(function(AValue: T): T
begin
Result := ASquare(ADeviation(AValue, Avg));
end,
AData);
// [...]


The anonymous function I pass as an argument to Map computes the deviation of the individual item it is passed, squares it, and returns a value.  This anonymous function is called once for each item in the original list. And here is the implementation of Map itself:

class function TStatistics<T>.Map(
AFunc: TUnaryOp<T>;
const AData: TEnumerable<T>): TEnumerable<T>;
var
Current: T;
NewList: TList<T>;
begin
NewList := TList<T>.Create;
for Current in AData do begin
NewList.Add(AFunc(Current));
end;
Result := NewList;
end;


Note that I have Map returning a new instance.  It is behaving as a constructor of sorts.  Indeed, it might make sense to have a constructor of this style on a list type. You could also write a version of Map which accepted an entity list to use as the result if it makes your lifetime management easier.

In the final post in this series, I'll use the infrastructure I've developed to implement Average and StandardDeviation.
Tags: Delphi


Comments

  • Guest
    Raymond Wednesday, 17 September 2008

    Hi Craig,

    You know, it's going to take a while to get used to code that looks like this:

    SquaredDeviations := Map(function(AValue: T): T
    begin
    Result := ASquare(ADeviation(AValue, Avg));
    end,
    AData);

    Regards the map() function itself, it's not so much returning a new instance, but it is returning a new TList instance, rather than an intance of the TEnumerable derived type. To be truly generic (sic) the map function either needs to be given the list (as yuou suggest), or the type of the required resultant list. The former is parhaps better for lifetime management smoothness (again, as you suggest) and also allows more complex construction semantics for the result list prior to passing to Map().

    One minor performance pick ;-) Given this code I would add a line:

    NewList.Capacity := AData.Count;

    after the construction of newlist to remove the potentially many resizes of the list.

    On reflection, I remember you saying that TEnumerable does not support Count (and suggested using a Fold() variant to implement a count). This is possibly another reason to provide Map with the result set.

  • Guest
    Raymond Thursday, 18 September 2008

    By 'Ideally I'd like to return an IEnumerable ... rather than constructing an actual list', are you suggesting that the Map() implementation actually be nothing more than an enumerator that calculates individual results from the given list when the caller sucks on it? This would have the, IMO, undesirable effect of synchronising the processing with the act of retrieving the result (something like a Quantum Mechanics approach ;-) ).

    Otherwise, I'm not sure how returning an IEnumerable helps as this otherwise needs to be an interface on a concrete container of results which must be constructed somewhere...

  • Guest
    Raymond Thursday, 18 September 2008

    I agree, the caller has no business in dictating the internal Map implementation. I was just trying to get my head around how returning an IEnumerable to the caller helps with resolving the issue with the sample implementation above constructing the result set.

    Ultimately the issue is that delegating creation and management of the result set to the Map() function removes some of the type safety inherent in the approach. The caller is required to perform additional type checks on the result set and the result set is no more than a collection of mutated instances of the original type T (which may not be particularly useful to the caller, or appropriate to the operation being performed).

    Anyhoo, this is really just nitpicking and not related to your aim of demonstrating generics. Keep up the good work!

  • Please login first in order for you to submit comments
  • Page :
  • 1

Check out more tips and tricks in this development video: