Language Friendly Type Names
.NET uses Type everywhere to represent type information. Not surprisingly, Type is language-agnostic. In many cases it is useful to get the friendly (that is, language-specific) name for a Type object, but .NET does not make this easy. There are several approaches, but none of them work well if you want the name as it would be written in your favorite language. This post discusses some of the available options and then provides a more general solution to the problem that doesn’t actually require much effort.
Different Type Categories
Before discussing the options available it is useful to summarize the different categories of types. Each category of types can potentially result in different syntax for the type name. Furthermore many of the existing options do not work for all the categories. Therefore we will define each of the categories of types and what kind of output we would expect. For simplicity we will use C# as the example language but the same concept applies to other languages.
- Primitives – This includes any type that is implicitly known by the language. Most primitives have aliases in a language. For example, Int32 is aliased as int in C# and Integer in VB.
- Arrays – This includes both single and multi-dimensional arrays (rectangular arrays). It also includes arrays of arrays (jagged arrays), which may have a different syntax depending upon the language. For example, C# uses a different syntax for rectangular arrays than it does for jagged arrays.
- Pointers – This includes pointers to any other category of types. Some languages do not support pointers.
- Closed generic types – This includes any generic type that has all type parameters specified with a type (e.g. List<int>).
- Open generic types – This includes any generic type where one or more type parameters do not yet have a type (e.g. IList<T>).
- Nested types – This includes any type nested inside another type (generally a class).
- Nullable types – This includes any value type, primitives included, wrapped in Nullable<T>. Some languages (like C#) have a special syntax for nullables.
- Simple – All other types, including normal value and reference types, require no special consideration. The type name is language friendly.
It is important to remember that a full type can be in several different categories. Building the friendly name requires the type to be broken down, in the correct order, into its individual categories. For example, List<int>[] is an array of closed generic types where the type parameter is a primitive.
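To make the breakdown concrete, here is a minimal sketch that walks a type with reflection and reports the categories it finds. The class and method names (TypeCategoryDemo, Describe) are made up for illustration, and IsPrimitive is the CLR's notion of a primitive, which is narrower than the language aliases discussed above (string, for example, is not a CLR primitive).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class TypeCategoryDemo
{
    // Describes how a type breaks down into categories, outermost first.
    public static string Describe ( Type type )
    {
        if (type.IsArray)
            return "array of (" + Describe(type.GetElementType()) + ")";
        if (type.IsPointer)
            return "pointer to (" + Describe(type.GetElementType()) + ")";

        if (type.IsGenericType && !type.IsGenericTypeDefinition)
        {
            var args = new List<string>();
            foreach (var arg in type.GetGenericArguments())
                args.Add(Describe(arg));

            return "closed generic " + type.GetGenericTypeDefinition().Name
                   + " of (" + String.Join(", ", args) + ")";
        }

        // CLR primitives only (Int32, Boolean, ...), not every C# alias.
        if (type.IsPrimitive)
            return "primitive " + type.Name;

        return "simple " + type.Name;
    }
}
```

Calling TypeCategoryDemo.Describe(typeof(List<int>[])) returns "array of (closed generic List`1 of (primitive Int32))", which mirrors the order in which the friendly name has to be built.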
Available Approaches
The following approaches are available in the framework currently. Each has advantages and disadvantages.
- Type.Name – The Type class has a property (actually two, Name and FullName) to get the type name. But the returned string is the formal framework name. For example, primitives are returned as the formal .NET type and generic types include the number of type parameters.
- CodeDomProvider.GetTypeOutput – The CodeDOM provides a method to get the type name given a CodeTypeReference. This is the currently recommended approach to this problem. There are several issues with this approach though. The CodeDOM is not lightweight, especially when you need to create the provider. The CodeDOM also requires a CodeTypeReference, so the type has to be converted. Here’s the code required to do this.

    var provider = new Microsoft.CSharp.CSharpCodeProvider();
    var typeRef = new System.CodeDom.CodeTypeReference(typeof(Int32));
    var actual = provider.GetTypeOutput(typeRef);

  By itself it is not too bad, but if performance is important then the CodeDOM is going to hurt. Even worse, however, is that the method returns a fully-qualified type name for each type. If you don’t want or need the full name then you’d have to parse the resulting string. It also does not handle nullable types.
- TypeName – VB provides a helper function that can get a type name. While it is a function for VB, it is also callable from other languages by using Microsoft.VisualBasic.Information.TypeName. Unfortunately it requires an instance of the type and not just the Type object.
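For reference, this small sketch shows what Type.Name returns for a few of the categories. The class name FrameworkNameDemo is made up for illustration; the values in the comments are the framework names, not what you would type in C#.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class FrameworkNameDemo
{
    // Returns the framework names of a few representative types.
    public static string[] Names ()
    {
        return new[]
        {
            typeof(int).Name,                      // "Int32"
            typeof(List<int>).Name,                // "List`1"
            typeof(int?).Name,                     // "Nullable`1"
            typeof(int[,]).Name,                   // "Int32[,]"
            typeof(Dictionary<string, int>).Name   // "Dictionary`2"
        };
    }
}
```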
None of the above approaches really work well. What we want is to be able to pass any type to a method and get back the string equivalent. Since different languages use different syntax we will need to identify the language we want to use as well. It should be very fast and handle all the categories mentioned earlier. Since none of the above approaches are very good we will create our own and it is surprisingly easy once you’ve broken the problem down.
Type Name Provider
Converting a type to a string involves two separate components: processing and formatting. During processing the type is taken apart to identify what category it is. This can be recursive as more complex types, like arrays, are broken up into their subtypes. Processing is, in general, the same regardless of the target language. Formatting, on the other hand, requires a target language. It involves converting the category to the language-specific syntax.
To keep things simple, but flexible, a simple abstract class called TypeNameProvider will contain the processing workflow. Derived types can override the workflow if needed. Each type category will have its own abstract method for formatting. Derived types will provide the language-specific implementation.

    public abstract class TypeNameProvider
    {
        public string GetTypeName ( Type type )
        {
            if (type == null)
                throw new ArgumentNullException("type");

            return GetTypeNameCore(type);
        }

        protected virtual string GetTypeNameCore ( Type type )
        {
            //Do processing of type
        }

        // Abstract format methods
    }

For C# the language provider is called CSharpTypeNameProvider and simply derives from TypeNameProvider.
Processing – Simple Types
Simple types include any type not handled by any other category, including primitives. During processing simple type formatting will be the default behavior if none of the other categories are found.
    protected virtual string GetTypeNameCore ( Type type )
    {
        return ProcessSimpleType(type);
    }

    protected virtual string ProcessSimpleType ( Type type )
    {
        return FormatSimpleType(type);
    }

    protected virtual string FormatSimpleType ( Type type )
    {
        return type.Name;
    }
Formatting a simple type just returns the type name. Primitives, which are language-specific, can be handled in this way as well. For C# the primitive types are stored in a static dictionary along with their aliases. A lookup is done on the simple type and the alias is returned, if found.
    protected override string ProcessSimpleType ( Type type )
    {
        string alias;
        if (s_aliasMappings.TryGetValue(type, out alias))
            return alias;

        return FormatSimpleType(type);
    }

    private static readonly Dictionary<Type, string> s_aliasMappings;
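The contents of the alias dictionary are not shown above. A minimal sketch of what it might contain, using the C# keyword aliases for the built-in types, could look like this. The wrapper class CSharpAliases and the method GetAliasOrName are made up so the example stands on its own; they are not part of the provider.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-alone version of the alias lookup, for illustration only.
public static class CSharpAliases
{
    // C# keyword aliases for the built-in types.
    private static readonly Dictionary<Type, string> s_aliasMappings = new Dictionary<Type, string>
    {
        { typeof(bool), "bool" },     { typeof(byte), "byte" },       { typeof(sbyte), "sbyte" },
        { typeof(char), "char" },     { typeof(decimal), "decimal" }, { typeof(double), "double" },
        { typeof(float), "float" },   { typeof(int), "int" },         { typeof(uint), "uint" },
        { typeof(long), "long" },     { typeof(ulong), "ulong" },     { typeof(short), "short" },
        { typeof(ushort), "ushort" }, { typeof(object), "object" },   { typeof(string), "string" },
        { typeof(void), "void" }
    };

    // Returns the C# alias when one exists, otherwise the framework name.
    public static string GetAliasOrName ( Type type )
    {
        string alias;
        return s_aliasMappings.TryGetValue(type, out alias) ? alias : type.Name;
    }
}
```

With this table GetAliasOrName(typeof(Int64)) returns "long", while a type without an alias, such as Guid, falls through to its framework name.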
Processing – Generic Types
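Generic types are a little harder to handle. IsGenericType determines if a type is a generic type (open or closed). IsGenericTypeDefinition is true only for a generic type definition such as List<>, where no type arguments have been supplied yet, and false once type arguments are supplied. Combining these calls identifies a generic type with its type arguments supplied, which can be processed as a closed generic type.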
    if (type.IsGenericType && !type.IsGenericTypeDefinition)
        return ProcessClosedGenericType(type);
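The reflection flags can be confusing, so here is a small sketch that classifies a type using them. The names GenericFlagsDemo, Classify and MakePartiallyOpen are made up for illustration. It also uses ContainsGenericParameters, which the provider in this post does not check, to show that a type can have some arguments supplied and still not be fully closed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class GenericFlagsDemo
{
    public static string Classify ( Type type )
    {
        if (!type.IsGenericType)
            return "not generic";
        if (type.IsGenericTypeDefinition)
            return "generic type definition";   // e.g. List<>
        if (type.ContainsGenericParameters)
            return "partially open";            // e.g. Dictionary<string, TValue>

        return "closed";                        // e.g. List<int>
    }

    // Builds Dictionary<string, TValue> by reusing the definition's own TValue parameter.
    public static Type MakePartiallyOpen ()
    {
        var definition = typeof(Dictionary<,>);
        var valueParameter = definition.GetGenericArguments()[1];

        return definition.MakeGenericType(typeof(string), valueParameter);
    }
}
```

Classify(typeof(List<>)) returns "generic type definition", Classify(typeof(List<int>)) returns "closed", and Classify(MakePartiallyOpen()) returns "partially open". The check used by the provider treats the last two cases the same way.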
To process the type the base type needs to be extracted along with each of the type arguments. The information will then be passed to the format method for final processing.
    protected virtual string ProcessClosedGenericType ( Type type )
    {
        var baseType = type.GetGenericTypeDefinition();
        var typeArgs = type.GetGenericArguments();

        return FormatClosedGenericType(baseType, typeArgs);
    }
The C# formatting implementation looks like this.

    protected override string FormatClosedGenericType ( Type baseType, Type[] typeArguments )
    {
        //The query expression requires a using directive for System.Linq
        var argStrings = String.Join(", ", from a in typeArguments select GetTypeName(a));

        //Format => Type<arg1, arg2, ...>
        return String.Format("{0}<{1}>", RemoveTrailingGenericSuffix(GetTypeName(baseType)), argStrings);
    }
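RemoveTrailingGenericSuffix is not shown in the post. One plausible sketch, assuming its job is simply to strip the arity suffix (the backtick and number) that the framework appends to generic type names, might be the following. The containing class GenericNameHelper is made up for illustration.

```csharp
using System;

// Hypothetical container; in the provider this would be a private helper method.
public static class GenericNameHelper
{
    // Strips the arity suffix, e.g. "List`1" => "List". Names without a suffix are returned unchanged.
    public static string RemoveTrailingGenericSuffix ( string typeName )
    {
        if (typeName == null)
            throw new ArgumentNullException("typeName");

        var index = typeName.LastIndexOf('`');

        return (index >= 0) ? typeName.Substring(0, index) : typeName;
    }
}
```

This only removes the last suffix, so it assumes the name belongs to a single, non-nested generic type.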
Processing – Nullable Types
Now that generic types are out of the way, handling nullable types is simply a matter of special-casing the base type. This could be handled by the language provider, but since several languages provide a special syntax we will handle the logic in the base class, inside the processing of closed generic types.
    if (baseType.IsValueType && baseType.Name == "Nullable`1")
        return FormatNullableType(typeArgs[0]);
The C# implementation looks like this.
    protected override string FormatNullableType ( Type type )
    {
        //Format => Type?
        return GetTypeName(type) + "?";
    }
Processing – Arrays
Arrays are complicated by the fact that they can be single, multi-dimensional or jagged. Arrays need to be processed before most other types.
    if (type.IsArray)
        return ProcessArrayType(type);
Processing an array requires separating the element type (which may be an array) from the array definition and then identifying the number of dimensions the array has.
    protected virtual string ProcessArrayType ( Type type )
    {
        var elementType = type.GetElementType();
        var dimensions = type.GetArrayRank();

        return FormatArrayType(elementType, dimensions);
    }
For C# a multi-dimensional array requires inserting a comma for each dimension past the first. The implementation looks like this.
    protected override string FormatArrayType ( Type elementType, int dimensions )
    {
        //Format => Type[,,,]
        return String.Format("{0}[{1}]", GetTypeName(elementType), new string(',', dimensions - 1));
    }
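One subtlety worth checking with jagged arrays of mixed rank: in C# the rank specifiers read left to right from the outermost array, so int[][,] is a one-dimensional array whose elements are int[,]. Formatting the element type first and appending the outer specifier last produces the specifiers in the opposite order for such types. The sketch below is one possible alternative, not the post's implementation: it walks down the element types and collects the specifiers outermost first. The names ArrayNameDemo and FormatArray are made up, and the element formatter is passed in as a delegate so the example stands on its own.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class ArrayNameDemo
{
    // Formats an array type in C# order; formatElement names the innermost non-array type.
    public static string FormatArray ( Type type, Func<Type, string> formatElement )
    {
        var specifiers = new StringBuilder();
        var current = type;

        // Collect rank specifiers from the outermost array inward.
        while (current.IsArray)
        {
            specifiers.Append('[').Append(',', current.GetArrayRank() - 1).Append(']');
            current = current.GetElementType();
        }

        return formatElement(current) + specifiers.ToString();
    }
}
```

For example, ArrayNameDemo.FormatArray(typeof(int[][,]), t => t.Name) returns "Int32[][,]". For arrays where every level has the same rank, such as int[][], both approaches give the same result.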
Processing – Pointers, ByRef
Not all languages support a pointer but it is still a valid .NET type. Additionally there is a special category for by ref types. These need to be handled early in the processing because the modifier needs to be stripped so the base type can be processed.
    if (type.IsPointer)
        return ProcessPointerType(type);

    if (type.IsByRef)
        return ProcessByRefType(type);
Processing either of these types requires getting the element type and then formatting it appropriately.
    protected virtual string ProcessByRefType ( Type type )
    {
        var refType = type.GetElementType();

        return FormatByRefType(refType);
    }

    protected virtual string ProcessPointerType ( Type type )
    {
        var pointerType = type.GetElementType();

        return FormatPointerType(pointerType);
    }
The C# implementation looks like this.
    protected override string FormatByRefType ( Type elementType )
    {
        //C# doesn't have a byref syntax for general types
        return GetTypeName(elementType);
    }

    protected override string FormatPointerType ( Type elementType )
    {
        //Format => Type*
        return GetTypeName(elementType) + "*";
    }
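The individual checks have been shown piece by piece, so it may help to see them in one place. The following is a condensed sketch that collapses the processing and C# formatting steps into a single recursive method. It is not the provider class itself: the names FriendlyNameSketch and GetName are made up, only two aliases are included to keep it short, and the order of the checks (by ref, pointer, array, closed generic with the nullable special case, then simple) is my reading of the sections above.

```csharp
using System;
using System.Linq;

// Hypothetical condensed version of the workflow, for illustration only.
public static class FriendlyNameSketch
{
    public static string GetName ( Type type )
    {
        if (type == null)
            throw new ArgumentNullException("type");

        // Modifiers first, so the underlying type can be processed on its own.
        if (type.IsByRef)
            return GetName(type.GetElementType());
        if (type.IsPointer)
            return GetName(type.GetElementType()) + "*";
        if (type.IsArray)
            return GetName(type.GetElementType()) + "[" + new string(',', type.GetArrayRank() - 1) + "]";

        if (type.IsGenericType && !type.IsGenericTypeDefinition)
        {
            var definition = type.GetGenericTypeDefinition();
            var args = type.GetGenericArguments();

            // Nullable<T> is a closed generic with its own syntax.
            if (definition == typeof(Nullable<>))
                return GetName(args[0]) + "?";

            // Strip the arity suffix: "List`1" => "List".
            var name = definition.Name;
            var tick = name.LastIndexOf('`');
            if (tick >= 0)
                name = name.Substring(0, tick);

            return name + "<" + String.Join(", ", args.Select(a => GetName(a))) + ">";
        }

        // Simple types; a real provider would use the full alias table.
        if (type == typeof(int)) return "int";
        if (type == typeof(string)) return "string";

        return type.Name;
    }
}
```

With this sketch GetName(typeof(List<int?>[])) returns "List<int?>[]" and GetName(typeof(Dictionary<string, int[]>)) returns "Dictionary<string, int[]>".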
Processing – Other Categories
My original goal in writing this code was to generate cleaner T4 template code. Therefore the code was not written to handle categories of types that wouldn’t appear in a template. The following categories are not supported, but the code could be modified to support them rather easily if desired.
- Nested types – The general guideline is that a nested type is not publicly available. As such, exposing a public, nested type in a T4 template does not make a lot of sense and is not supported. Type.Name returns only the nested type's own name, while Type.FullName displays it as ParentType+NestedType. It would be relatively easy to handle a nested type by separating the parent from the child and replacing the plus sign with the separator for the language.
- Open generic types – As of yet it has not been necessary for me to generate an open type in T4, and therefore the code does not support this category. To support an open generic type it is necessary to modify the generic type code a little. If the type is generic but it is a generic type definition, then the type parameters would be used instead of the type arguments. Keep in mind that an open type may have a mix of arguments and parameters. Where it gets more difficult is with constraints. Each constraint would have to be generated as well, and each could be either a constraint keyword (depending upon the language) or a type requirement.
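As a rough idea of what nested type support might involve, the sketch below builds the name by walking Type.DeclaringType rather than parsing the plus sign out of FullName. The names NestedNameDemo and GetNestedName are made up, and the sketch assumes a non-generic type that is not itself a generic type parameter (a type parameter also reports a DeclaringType).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class NestedNameDemo
{
    // Joins the names of the enclosing types, outermost first, with the language's separator.
    public static string GetNestedName ( Type type, string separator )
    {
        var parts = new Stack<string>();

        // Walk outward; the stack reverses the order so the outermost type comes first.
        for (var current = type; current != null; current = current.DeclaringType)
            parts.Push(current.Name);

        return String.Join(separator, parts);
    }
}
```

For example, GetNestedName(typeof(Environment.SpecialFolder), ".") returns "Environment.SpecialFolder", whereas Type.FullName for the same type is "System.Environment+SpecialFolder".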
Namespaces
The above code should now work with any type that might be found in the framework. But there is one more scenario that ideally will be handled. In a T4 template we cannot always assume that the namespace for a type is included. It makes sense to allow an option to include the namespace on the type. Because of the recursive nature of the provider we really only need to add the namespace in the FormatSimpleType method.
    public bool IncludeNamespace { get; set; }

    protected virtual string FormatSimpleType ( Type type )
    {
        return IncludeNamespace ? type.FullName : type.Name;
    }
Simplifying the Usage
We want this to be as easy as possible, so it makes sense to create an extension method off of Type that returns the language-friendly name. Because I’m a C# developer the base extension will assume C#, but overloads will be provided to allow for other providers. Here’s the basic code.
    public static class TypeExtensions
    {
        public static string GetFriendlyName ( this Type source )
        {
            return GetFriendlyName(source, new CSharpTypeNameProvider());
        }

        public static string GetFriendlyName ( this Type source, bool includeNamespace )
        {
            return GetFriendlyName(source, new CSharpTypeNameProvider() { IncludeNamespace = includeNamespace });
        }

        public static string GetFriendlyName ( this Type source, TypeNameProvider provider )
        {
            if (provider == null)
                throw new ArgumentNullException("provider");

            return provider.GetTypeName(source);
        }
    }
Enhancements
There are several enhancements that can be made to the code if desired.
- A type provider is pretty static in its behavior. Currently the namespace option is a property on the provider. It might be better to move the option into a simple options type that can be passed to the provider as an argument. This would make it easier to add other options later on.
- Since the provider is static in its behavior it might be useful to expose a singleton instance that can be used rather than requiring a new instance to be created each time.
- You could go further and expose “standard” providers off of a type (like StringComparer does).
- Add support for nested types by separating the parent type name from the nested type name and replacing the plus sign with the language's separator.
- Add support for open generic types by processing both the type arguments and the type parameters along with the constraints.
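As a rough illustration of the "standard providers" idea, the sketch below exposes a shared instance from a static class, in the spirit of StringComparer.Ordinal. Everything here is hypothetical: TypeNameProviders is a made-up name, and the two provider classes are minimal stand-ins (suffixed with Stub) so the example compiles on its own. Sharing one instance is only safe if the provider has no mutable state, which is one reason to move IncludeNamespace into an options argument first.

```csharp
using System;

// Minimal stand-ins so the sketch compiles on its own; not the classes built in this post.
public abstract class TypeNameProviderStub
{
    public abstract string GetTypeName ( Type type );
}

public sealed class CSharpTypeNameProviderStub : TypeNameProviderStub
{
    // Placeholder behavior; the real provider does the full processing.
    public override string GetTypeName ( Type type )
    {
        return type.Name;
    }
}

// Hypothetical entry point for the standard providers.
public static class TypeNameProviders
{
    private static readonly TypeNameProviderStub s_csharp = new CSharpTypeNameProviderStub();

    // Shared, stateless C# provider.
    public static TypeNameProviderStub CSharp
    {
        get { return s_csharp; }
    }
}
```

Callers would then write TypeNameProviders.CSharp.GetTypeName(typeof(Guid)) instead of creating a new provider for each call.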