Comparing C#, C++, and Delphi (Win32) Generics

C#, C++, and Delphi all have a language feature for generic types and methods. Although all three languages are statically typed, they implement generics in very different ways. I'm going to give a brief overview of the differences, both in terms of language features and implementation. I presume that Delphi Prism generics work essentially the same as C# generics, which, as you’ll see, differ from Delphi/Win32 generics.

Let me say at the outset that although all three systems work somewhat differently, I don't see an overwhelming advantage to any one design. Generally, you can do what you need to do in all three environments. I'm writing this article not to claim that any one system is better than the others, but to point out some of the subtleties in the implementations.

Before I get started, I'd like to thank Barry Kelly for his useful feedback on my first draft of this article.

Compiling Instantiations


Every implementation of generic types works via a two-step process. First, you define a generic type or method with a "placeholder" for a specific type, which will be substituted later on. Later (exactly when depends upon the language), the type is "instantiated." Note that instantiating a generic type is very different from instantiating an object. The former happens within the compiler, whereas the latter happens at runtime.

Instantiation is triggered when some code uses a generic type or method with a specific type parameter. Based upon the generic definition and the types or values passed when the generic is used, a specific implementation is substituted so that machine code can be generated. Instantiation is one of the most important differences between real generic types and non-generic types used with casts. In the end, different machine code is generated for instantiations with different type parameters.
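To make the difference between an instantiation and an instance concrete, here is a minimal C++ sketch. The `Box` template and the `sum_of_two_instances` helper are names I made up for illustration; they don't come from any library.

```cpp
#include <type_traits>

// Hypothetical generic type: T is the placeholder that gets substituted
// when the template is instantiated.
template <typename T>
struct Box {
    T value;
};

// Box<int> and Box<double> are two instantiations: distinct types that the
// compiler produces from the single generic definition above.
static_assert(!std::is_same<Box<int>, Box<double>>::value,
              "different type arguments give different types");

// Two instances (objects created at runtime) of the one instantiation Box<int>.
inline int sum_of_two_instances() {
    Box<int> a{1};
    Box<int> b{2};
    return a.value + b.value;
}
```

The `static_assert` is checked entirely inside the compiler, while the two `Box<int>` objects only come into existence when `sum_of_two_instances` runs.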

In C# and Delphi, there is a language feature which is solely dedicated to implementing generic types and methods. In C++, on the other hand, the "templates" language feature can be used to implement generic types and methods, among many, many other things. It is even possible to do general-purpose programming using templates, which C++ programmers call "metaprogramming."
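As a rough illustration of what "metaprogramming" means here, the sketch below computes a factorial during compilation. `Factorial` is a hypothetical name of my own, and the example relies on a specialization for zero to end the recursion.

```cpp
// Hypothetical compile-time factorial. The recursion is carried out by the
// compiler while it instantiates templates; no loop runs at runtime.
template <unsigned N>
struct Factorial {
    static constexpr unsigned long long value = N * Factorial<N - 1>::value;
};

// Hand-written case for 0, which terminates the recursion.
template <>
struct Factorial<0> {
    static constexpr unsigned long long value = 1;
};

// Verified by the compiler, not by running the program.
static_assert(Factorial<5>::value == 120, "5! is computed at compile time");
```

Nothing like this is possible with C# or Delphi generics, which only substitute types into a fixed implementation.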

C++ templates require the template source code to be available when the code using the template is compiled. This is because the compiler does not actually compile the template as a separate entity, but rather instantiates it "in-place" and only compiles the instantiation. The C++ compiler is effectively doing code generation, substituting the type parameter (or value) for the placeholder for the type, and generating new code for the instantiation. Update: Moritz Beutel elaborates on this in his excellent comment on this post. You should read the full comment, but the short version is that, because of the way templates are compiled, errors in the code which uses a template can appear (judging from the compiler's error messages) to be errors in the template itself. Moreover, the implementation of most C++ compilers makes this problem even worse than implementing the C++ standard requires.

In Delphi and C#, on the other hand, the generic type or method and the code which uses it can be compiled separately. Therefore, you can compile a library which contains a generic type, and later on compile an executable which uses an instantiation of that type and has a reference to the binary library, rather than to the source code for the library.

Another way to think of this difference is that in C++, a template will not be compiled at all until it is used. In Delphi and C#, on the other hand, a generic type or method must be compiled before it can be used.
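One visible consequence in C++ is that a member function of a class template is only instantiated if something actually calls it. The sketch below uses a hypothetical `Wrapper` type of my own invention to show this.

```cpp
#include <string>

// Hypothetical wrapper whose size() only makes sense for container-like types.
template <typename T>
struct Wrapper {
    T value;

    // Not checked against a particular T until some code calls it.
    int size() const { return static_cast<int>(value.size()); }

    T get() const { return value; }
};

// Compiles even though int has no size() member: Wrapper<int>::size() is
// never used, so its body is never instantiated.
inline int use_wrapper() {
    Wrapper<int> w{42};
    return w.get();
}

// Here size() is used, and std::string does provide size().
inline int use_wrapper_size() {
    Wrapper<std::string> w{"abc"};
    return w.size();
}
```

Adding a call to `size()` on a `Wrapper<int>` would turn this into a compile error, reported at the point where the template is used.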

In Delphi, the compiler uses a feature closely related to the method inlining feature. This causes the compiler to store the relevant bits of the abstract syntax tree for the generic type in the compiled DCU. When the code which uses the generic type is compiled, this bit of the abstract syntax tree is read and included in the abstract syntax tree for the code which uses the generic type. When machine code is produced from the new, “compound” abstract syntax tree, it looks, to the code emitter, like the type was defined with the type parameter "hard coded." Instead of linking to compiled code in the DCU, the code which uses the generic type emits new code for the instantiation into its own DCU.

Because generic instantiation is performed in the same area of the Delphi compiler which does method inlining, there are some limitations on what you can do in a generic method, or a method of a generic type. Like inlined methods, these methods cannot contain ASM. Also, calls to these methods cannot be inlined. These restrictions are limitations of the implementation, not of the language design, and could theoretically be removed in a future version of the compiler.

C# generics use the .NET Framework 2.0+, which has native support for generic types. The C# compiler emits IL which specifies that a generic type should be used, with certain type parameters. The .NET framework implements these types using one instantiation for any reference type, and custom instantiations for value types. (Don’t confuse “instantiation” with “instance” in the preceding sentence; they mean entirely different things in this context. There are usually many instances of one instantiation.) This is because a reference to a reference type is always the same size, whereas value types can be many different sizes. Later, the IL will be JITted into machine code, and, as with compiled C++ or Delphi code, types don’t really exist at the machine code level. In .NET, generic type instantiation and JITting are two distinct operations.

So one important difference in generics implementations is when the instantiation occurs. It occurs very early in C++ compilation, somewhat later for Delphi compilation, and as late as possible for .NET compilation.

Custom Specializations


Another very important difference is that C++ allows custom instantiations, called specializations, including specializations by value. With C# and Delphi, on the other hand, the only way to instantiate a generic type is to use that type with an explicit type parameter. The implementation will always be the same, with the exception of the type of the type parameter. Because C++ allows custom instantiations, it is easy for a programmer to write different implementations of a method, for example, for different integer values. Like operator overloading, this is a powerful feature which requires considerable self-restraint to avoid abuse.
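Here is a small sketch of both kinds of specialization in C++, by type and by value. The function names `describe` and `dimension_name` are hypothetical and exist only for this example.

```cpp
#include <string>

// Primary template: used for every type without a custom specialization.
template <typename T>
std::string describe() { return "generic"; }

// Specialization by type: a hand-written body used only when T is bool.
template <>
inline std::string describe<bool>() { return "bool"; }

// Primary template parameterized by an integer value rather than a type.
template <int N>
std::string dimension_name() { return "n-dimensional"; }

// Specializations by value: different bodies for particular integers.
template <>
inline std::string dimension_name<2>() { return "plane"; }

template <>
inline std::string dimension_name<3>() { return "space"; }
```

With these definitions, `dimension_name<2>()` runs the hand-written "plane" body, while `dimension_name<7>()` falls back to the primary template. The caller can't tell from the call site which implementation it is getting, which is where the need for self-restraint comes in.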

Constraints


Delphi and C# both have a generic constraint feature, which allows/requires the developer of a generic type or method to limit which type parameter values can be passed. For example, a generic type which needs to iterate over some list of data could require that the type parameter support IEnumerable, in either language. This allows the developer of the generic type to make her intentions for the use of the type very clear. It also allows the IDE to provide code completion/IntelliSense on the type parameter, within the definition of the generic type. Also, it allows a user of the generic type to be confident that they are passing a legal value for the type parameter without having to compile their code to find out.

In C++, on the other hand, there is not presently any such feature. A more powerful/complex feature called "concepts" was considered for, but ultimately removed from, C++0x.

An implication of the lack of constraints is that C++ templates are duck typed. If a generic method calls some method, Foo, on a type passed as the generic type parameter, then the template is going to compile just fine so long as the type parameter passed contains some method called Foo with the appropriate signature, no matter where or how it is defined.
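A minimal sketch of this duck typing follows. `Duck`, `Robot`, and `call_foo` are hypothetical names I chose for the example; the two structs deliberately share no base class or interface.

```cpp
#include <string>

// Two unrelated hypothetical types: no common base class or interface.
struct Duck  { std::string Foo() const { return "quack"; } };
struct Robot { std::string Foo() const { return "beep"; } };

// The template only works for types that have a callable Foo() whose result
// converts to std::string, but nothing in its declaration says so.
template <typename T>
std::string call_foo(const T& item) {
    return item.Foo();
}
```

Both `call_foo(Duck{})` and `call_foo(Robot{})` compile, because each type happens to have a suitable `Foo`. Passing a type without one, such as an `int`, fails only when that particular instantiation is compiled.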

Covariance and Contravariance


Let’s say I have a function which takes an argument of type IEnumerable<TParent>. Can I pass an argument of type IEnumerable<TChild> to that function? What if the argument type were List<TParent> instead of IEnumerable<TParent>? Or what if the generic type was the function result rather than the function argument? The formal names for these problems are covariance and contravariance. The precise details are too complicated to explain in this article, but the examples above summarize the most common times you run into the problem.

Delphi generics and C++ templates do not support covariance and contravariance. So the answers to the questions above are no, no, and no, although there are, of course, workarounds, like copying the data into a new list. In C# 4.0, the type parameters of generic interfaces and delegates can be declared covariant or contravariant, so the examples above can be made to work where appropriate. "Where appropriate" involves non-trivial subtleties hinted at above, and exemplified by the fact that arrays in .NET have (intentionally) broken covariance. However, the BCL routines in the .NET Framework 4.0 have been annotated to support covariance and contravariance when appropriate, so developers will benefit from the feature without having to fully understand it.
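For the C++ case, the sketch below shows both the "no" and the copying workaround, using `std::vector` in place of a list. `Parent`, `Child`, `sum_ids`, and `sum_child_ids` are hypothetical names for illustration only.

```cpp
#include <type_traits>
#include <vector>

struct Parent {
    virtual ~Parent() = default;
    virtual int id() const { return 1; }
};
struct Child : Parent {
    int id() const override { return 2; }
};

// A single Child* converts to Parent* ...
static_assert(std::is_convertible<Child*, Parent*>::value,
              "pointer conversion follows inheritance");
// ... but the two vector instantiations are unrelated types.
static_assert(!std::is_convertible<std::vector<Child*>,
                                   std::vector<Parent*>>::value,
              "no covariance between instantiations");

// A function written against the parent type.
int sum_ids(const std::vector<Parent*>& items) {
    int total = 0;
    for (const Parent* p : items) total += p->id();
    return total;
}

// Workaround: copy the elements into a new list of the expected type.
int sum_child_ids(const std::vector<Child*>& children) {
    std::vector<Parent*> copy(children.begin(), children.end());
    return sum_ids(copy);
}
```

The copy costs time and memory proportional to the length of the list, which is the price of the workaround.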