Generics

"Generics are used to write code that is reusable across different types. Generics can be used with methods, and the following types: class, structure, interface, and delegate. Generic code contains a type parameter (<T>) which specifies the data type which will be processed. Constraints can be added to type parameters to restrict the kinds of types which can be used for arguments. Starting in C# 4.0, variance was allowed during the assignment of type parameters for generic interfaces and generic delegates. The use of generic collections is preferred over the use of non-generic collections."

Prior to the addition of generics in C# 2.0, code was generalized by casting values to and from the universal base type Object in order to broaden the group of data types which could be used with a code segment. This broadening of the data types (a.k.a. generalization) was used heavily in creating collection classes which were designed for implementing various data structures (stacks, queues, etc.). These earlier versions of collections are now referred to as non-generic collections. Most non-generic collections could hold any type (e.g. int, string, bool, float, etc.), which was convenient, but this implementation of generalization also had two problems:

1. Performance Penalty - especially when working with value types (e.g. numerical data). Value types would have to be boxed/unboxed when converting to/from reference types. This required the runtime to perform extra processing which could result in a significant performance penalty when working with large amounts of data.

2. Invalid Cast Exceptions - caused when the generalized data is used at runtime. The compiler was not able to ensure type compatibility, so an InvalidCastException could occur when processing the data. For example, when code retrieves an element expecting a number in order to perform a mathematical operation, but the element is actually a String.

In C# 2.0 generics (a.k.a. Parametric Polymorphism) were added and used to refactor the code involving generalization, such as the collection classes. The initial versions of the collection classes are still available (System.Collections) for compatibility with prior code, but the generic versions (System.Collections.Generic) are now recommended for implementing collections in code that targets .NET 2.0 or a more recent release. Besides its use in creating collections, generics also allow developers to create custom generic methods, classes, structs, interfaces, delegates, and events. Using generics can provide compile-time type safety, eliminate the performance penalty associated with boxing/unboxing, and simplify the code required to provide reusable solutions.

History of .NET Collections

The history of .NET collections illustrates the problems generics were designed to solve. Following the evolution from Array to ArrayList to List<T> also provides a better understanding of how collections and generics work.

Arrays - Primitive Containers of Fixed Size

The earliest high-level programming languages had support for arrays. Arrays have been widely used for purposes ranging from implementing mathematical matrices to storing tabular data. The problem with arrays is that they hold a fixed number of entries. A runtime "Out of Bounds" exception occurs when an attempt is made to add an entry outside the declared size of the array. Specifically, in C# this causes an IndexOutOfRangeException.
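
The fixed-size limitation can be seen in a minimal sketch like the one below. The class and method names (ArrayBoundsDemo, TryStore) are hypothetical and used only for illustration; the point is simply that writing past the declared size raises an IndexOutOfRangeException at runtime.

```csharp
using System;

public static class ArrayBoundsDemo
{
    // Hypothetical helper: tries to store a value and reports whether the index was valid.
    public static bool TryStore(int[] slots, int index, int value)
    {
        try
        {
            slots[index] = value;
            return true;
        }
        catch (IndexOutOfRangeException)
        {
            // The array cannot grow, so an index past the end is a runtime error.
            return false;
        }
    }

    public static void Main()
    {
        int[] slots = new int[3];                   // valid indexes are 0, 1 and 2
        Console.WriteLine(TryStore(slots, 2, 42));  // True
        Console.WriteLine(TryStore(slots, 3, 99));  // False
    }
}
```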

ArrayLists - Dynamically Sized as Required

Collections were designed to provide a simple way of creating commonly used data structures (lists, stacks, queues, etc.). Collections solved the Array's problem of having a fixed size by performing dynamic storage allocation. Collections also included methods for manipulating the data in the structure (add, put, enqueue, etc.) and supported properties containing information about the structure (such as Count). While the fixed-size limitation was solved with this first generation of collections (now referred to as "non-generic collections"), two other problems were discovered. Both problems arose because the collections stored the data as "object" types.

Most non-generic collections can hold any type of data. Add an int, a float, a bool, a string ... the collection can hold them all. To implement this ability, the underlying code upcasts the data to an Object type. This works because in C# all the types are derived from the Object base type. But what does this implementation do to performance? What happens when we retrieve data from the collection and perform an operation that is not supported for that data type?

ArrayList - Non-generic Collection

using System;
using System.Collections;

namespace NongenericCollection
{
    class Program
    {
        static void Main ()
        {
            int quotient = 0;
            ArrayList myArrayList = new ArrayList();
            myArrayList.Add(3);
            myArrayList.Add(3.0);
            myArrayList.Add("three");

            foreach (Object obj in myArrayList)
            {
                System.Console.WriteLine(obj);
                quotient = 3 / (int) obj;     // InvalidCastException on 3.0: a boxed double cannot be unboxed to int
            }
        }
    }
}

The first answer is that performance can be poor, especially if we are upcasting a value type (e.g. a number) to an object. This causes the runtime to box/unbox the values, which can require a significant amount of processing when working with a large number of elements. The second answer is that it will cause runtime exceptions. If we expect to be working with an integer and perform an integer operation on a string, a runtime exception will occur. If only there were some way to restrict the collection to a specific data type, without having to code a different version for each data type.

List<T> - Usually the Best Option

Enter generics and List<T>. Generic collections solve the problems of the boxing/unboxing performance penalty and the lack of type safety by using a type parameter. For example, because a List<int> can only hold integer values, the runtime does not have to box each integer into a separate object on the heap. Instead the values are stored directly in the list's internal int array, resulting in improved performance. Also, because the compiler knows that the collection can only hold integers, it will generate a compiler error when you try to add a non-compatible type (such as a string).

[Figure: Invalid Type Compiler Error - Compile-Time Type Checking]
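
As a rough counterpart to the ArrayList listing, the sketch below uses List<int>. The names TypedListDemo and SumOfQuotients are hypothetical. The commented-out line is the kind of statement the compiler rejects, so the mistake never reaches runtime.

```csharp
using System;
using System.Collections.Generic;

public static class TypedListDemo
{
    // Hypothetical helper: no casts are needed because every element is known to be an int.
    public static int SumOfQuotients(List<int> divisors)
    {
        int total = 0;
        foreach (int d in divisors)
        {
            total += 12 / d;
        }
        return total;
    }

    public static void Main()
    {
        List<int> numbers = new List<int>();
        numbers.Add(3);
        numbers.Add(4);
        // numbers.Add("three");   // compile-time error: cannot convert from 'string' to 'int'

        Console.WriteLine(SumOfQuotients(numbers));   // 12/3 + 12/4 = 7
    }
}
```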

System.Collections.Generic

Generic collections are defined in the System.Collections.Generic namespace. Some of the commonly used generic collections from that namespace are:

  • Dictionary<TKey, TValue> - provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast because the Dictionary<TKey, TValue> class is implemented as a hash table.

  • LinkedList<T> - is a doubly linked list, where each node points forward to the Next node and backward to the Previous node.

  • List<T> - a list of data elements that can be accessed by index. Provides methods to search, sort, and manipulate lists.

  • Queue<T> - a first-in, first-out collection of objects. Data elements are inserted at one end and removed from the other.

  • SortedDictionary<TKey, TValue> - collection of key/value pairs that are sorted on the key. The underlying structure is a binary search tree.

  • SortedSet<T> - collection of data elements that is maintained in sorted order. Duplicate elements are not allowed. Changing the sort values of existing items is not supported and may lead to unexpected behavior.

  • Stack<T> - is a last-in-first-out (LIFO) collection. Accepts null as a valid value for reference types and allows duplicate elements.
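
A brief tour of a few of these collections might look like the following sketch. The class CollectionsTour and its helper methods are hypothetical; only the collection types and their members come from System.Collections.Generic.

```csharp
using System;
using System.Collections.Generic;

public static class CollectionsTour
{
    // Queue<T> is first-in, first-out: the first item enqueued is the first removed.
    public static string DequeueFirst(params string[] items)
    {
        var queue = new Queue<string>(items);
        return queue.Dequeue();
    }

    // Stack<T> is last-in, first-out: the last item pushed is the first removed.
    public static string PopLast(params string[] items)
    {
        var stack = new Stack<string>(items);
        return stack.Pop();
    }

    // SortedSet<T> keeps elements in sorted order and drops duplicates.
    public static string SortedUnique(params int[] values)
    {
        return string.Join(",", new SortedSet<int>(values));
    }

    public static void Main()
    {
        // Dictionary<TKey, TValue> maps keys to values with fast lookup by key.
        var ages = new Dictionary<string, int>();
        ages.Add("Ada", 36);
        int age;
        if (ages.TryGetValue("Ada", out age))
        {
            Console.WriteLine(age);                          // 36
        }

        Console.WriteLine(DequeueFirst("first", "second"));  // first
        Console.WriteLine(PopLast("first", "second"));       // second
        Console.WriteLine(SortedUnique(5, 1, 3, 1));         // 1,3,5
    }
}
```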

Generic Type Parameters

Generic code uses a type parameter which is commonly designated by <T>. The names of type parameters are not significant to the compiler. If only one parameter is used, it is usually named T unless a descriptive name is needed. When there are multiple parameters, names such as TKey, TValue or T1, T2 are used. A type argument must be specified when declaring a variable of a generic type. When invoking a generic method, the type argument can be omitted if the compiler can infer it from the method arguments, but it is a good habit to specify the type argument explicitly on invocation as well.
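
To illustrate explicit versus inferred type arguments, here is a small sketch. The generic method Describe<T> and the class InferenceDemo are hypothetical names introduced for this example.

```csharp
using System;

public static class InferenceDemo
{
    // Hypothetical generic method: reports the type argument that T was resolved to.
    public static string Describe<T>(T value)
    {
        return typeof(T).Name + ": " + value;
    }

    public static void Main()
    {
        Console.WriteLine(Describe<int>(42));   // explicit type argument: Int32: 42
        Console.WriteLine(Describe(42));        // inferred from the argument: Int32: 42
        Console.WriteLine(Describe("hi"));      // inferred as string: String: hi
    }
}
```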

C# has rules for the usage of type parameters. These rules may refer to generic types as being "open", "closed", "unbound", or "constructed". Below is a definition of these terms as they relate to generic type parameters.

  1. open - contains unresolved type parameters (e.g. MyClass<T>)
  2. closed - all type parameters have been resolved (e.g. MyClass<String>)
  3. unbound - contains no type arguments (e.g. MyClass<>)
  4. constructed - declaration contains at least one specified type argument (e.g. MyClass<String, T>)

Note 1: An open generic type cannot be instantiated. It becomes closed once type arguments have been supplied for all of its type parameters.
Note 2: The only way to have an unbound generic type is with the "typeof" operator (e.g. typeof(MyClass<>)).
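
These terms can be observed through reflection. The sketch below is one possible way to check them, assuming a hypothetical class BehaviorMap<T> that derives from Dictionary<string, T>. It relies on the Type.IsGenericTypeDefinition and Type.ContainsGenericParameters properties.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical generic class whose base type is an open constructed type.
public class BehaviorMap<T> : Dictionary<string, T> { }

public static class TypeKindsDemo
{
    public static void Main()
    {
        Type unbound = typeof(Dictionary<,>);                  // no type arguments
        Type closed = typeof(Dictionary<string, bool>);        // all parameters resolved
        Type openConstructed = typeof(BehaviorMap<>).BaseType; // Dictionary<string, T>

        Console.WriteLine(unbound.IsGenericTypeDefinition);            // True
        Console.WriteLine(closed.ContainsGenericParameters);           // False
        Console.WriteLine(openConstructed.IsGenericTypeDefinition);    // False
        Console.WriteLine(openConstructed.ContainsGenericParameters);  // True
    }
}
```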

The Dictionary type:

Dictionary<String, T>

in the example below is considered an open constructed type. "Open" because it contains unresolved type parameters (T). "Constructed" because the declaration contains at least one specified type argument (String).


Dictionary<String, T> - Custom Cat Behaviors Dictionary

using System;
using System.Collections;
using System.Collections.Generic;

namespace GenericExample
{
    class Program
    {
        public class CatBehaviors<T> : Dictionary<String, T> { }

        static void Main ()
        {
            CatBehaviors<bool> MyCatsBehaviors = new CatBehaviors<bool>();
            MyCatsBehaviors.Add("Crazies", true);
            MyCatsBehaviors.Add("Aloof", false);

            foreach (KeyValuePair<String, bool> kvp in MyCatsBehaviors)
                Console.WriteLine(kvp.Key + " is " + kvp.Value);
            Console.WriteLine();
        }
    }
}

Type Parameter Constraints

An important consideration in designing generic code is determining which types to allow as type arguments. The more types you allow, the more flexible and reusable the code. However, allowing more types can also increase code complexity. Constraints can be applied to type parameter declarations to restrict the types which can be used with the generic code. A rule of thumb in determining which types to support in generic code is:

Apply the maximum constraints possible which will still let you handle all the necessary types.

Constraints are added to type parameters through the use of the where keyword. Multiple constraints can be applied to the same type parameter. The following table lists the available constraints.

Constraint                      Description
where T : struct                The type argument must be a non-nullable value type.
where T : class                 The type argument must be a reference type.
where T : new()                 The type argument must have a public parameterless constructor. When used together with other constraints, the new() constraint must be specified last.
where T : <base class name>     The type argument must be or derive from the specified base class.
where T : <interface name>      The type argument must be or implement the specified interface. Multiple interface constraints can be specified. The constraining interface can also be generic.
where T : U                     The type argument supplied for T must be or derive from the argument supplied for U.
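
A few of these constraints are sketched below. The class ConstraintDemo and its methods (Max, GetOrCreate, AddAll) are hypothetical and chosen only to show where each constraint makes the method body legal.

```csharp
using System;
using System.Collections.Generic;

public static class ConstraintDemo
{
    // Interface constraint: CompareTo is available because T must implement IComparable<T>.
    public static T Max<T>(T a, T b) where T : IComparable<T>
    {
        return a.CompareTo(b) >= 0 ? a : b;
    }

    // class + new(): T can be compared to null and created with "new T()".
    // Note that new() is listed last.
    public static T GetOrCreate<T>(T existing) where T : class, new()
    {
        return existing ?? new T();
    }

    // where T : U: every T is a U, so each item can be added to a List<U>.
    public static void AddAll<T, U>(List<U> target, IEnumerable<T> source) where T : U
    {
        foreach (T item in source)
        {
            target.Add(item);
        }
    }

    public static void Main()
    {
        Console.WriteLine(Max(3, 7));                              // 7
        Console.WriteLine(GetOrCreate<List<int>>(null).Count);     // 0

        var objects = new List<object>();
        AddAll(objects, new[] { "a", "b" });                       // T = string, U = object
        Console.WriteLine(objects.Count);                          // 2
    }
}
```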


Type Parameter Default Values

Assigning an appropriate default value to a variable of a parameterized type depends upon the type used to resolve the parameter. For example,

t = null;  (is only valid for reference types)
t = 0;  (is only valid for numeric value types)

To address this issue C# defined another use for the default keyword (besides its use in switch statements). The "default" keyword will return null for reference types and zero for numeric value types. For structs, each member is initialized to zero (for value types) or null (for reference types).

t = default(T);  // returns null for reference types and zero for numeric value types
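
As an illustration, the sketch below uses default(T) as a fallback value in a generic method. The names DefaultDemo and FirstOrDefaultValue are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class DefaultDemo
{
    // Hypothetical helper: returns the first element, or the default value of T
    // when the list is empty.
    public static T FirstOrDefaultValue<T>(IList<T> items)
    {
        return items.Count > 0 ? items[0] : default(T);
    }

    public static void Main()
    {
        Console.WriteLine(FirstOrDefaultValue(new List<int>()));             // 0
        Console.WriteLine(FirstOrDefaultValue(new List<bool>()));            // False
        Console.WriteLine(FirstOrDefaultValue(new List<string>()) == null);  // True
    }
}
```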


Variance in Generic Types

The ability to create strongly typed generic collections was a big improvement over the untyped non-generic collections. However, strongly typed collections have their own disadvantages when working with inheritance. Starting in C# 4.0 this disadvantage was addressed by providing support for "variance" in the type parameters of generic interfaces and generic delegates (generic classes do NOT currently support variance). Variance allows an instance of a generic interface or delegate type to be assigned to a variable of the same generic type constructed with a less derived (covariance) or more derived (contravariance) type argument.

For example, if you have the following class declarations where Animal is the base class and Mammal is derived from Animal:

public class Animal { }
public class Mammal : Animal { }

  • If invariant (no variance) then: "An Animal type can only be assigned to another Animal type" and "A Mammal type can only be assigned to another Mammal type". Variance applies only to reference types, so type arguments that are value types are always invariant.

  • If covariance is supported, then "A Mammal type can be assigned to an Animal type". Or stated another way, "you can treat a Mammal like it is an Animal".
    IEnumerable<Animal> animals = new List<Mammal>();   // Covariant
  • If contravariance is supported, then "An Animal type can be assigned to a Mammal type". Or stated another way, "code that handles any Animal can be used where code that handles a Mammal is required". Note that a plain assignment such as "Mammal m = new Animal();" does not compile; contravariance applies only to the type arguments of generic interfaces and delegates.
    Action<Animal> feedAnimal = a => { };
    Action<Mammal> feedMammal = feedAnimal;   // Contravariant

Variance Definition

"Only generic interface types and generic delegate types can have variant type parameters. A generic interface or generic delegate type can have both covariant and contravariant type parameters."

Variance can be defined in a more general sense as:

  • Invariant
    • Only another instance of the exact type can be assigned to the type. Cannot use a type that is either more or less derived.
  • Covariant
    • Enables you to assign a more derived type to a less derived type.
    • "Co" means go with; i.e. the assignment goes in the same direction as ordinary assignment compatibility (derived to base).
    • Designated with the out keyword, which restricts T to output positions such as return values, for example:  public interface IEnumerable<out T> : IEnumerable
  • Contravariant
    • Enables you to assign a less derived type to a more derived type.
    • "Contra" means go against; i.e. the assignment goes in the opposite direction (base to derived).
    • Designated with the in keyword, which restricts T to input positions such as method parameters, for example: public interface IComparer<in T>
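
The sketch below shows both directions using types from the base class library: IEnumerable<out T> for covariance and Func<in T, out TResult> for contravariance in its parameter. The Name property and the VarianceDemo class are hypothetical additions made so the example has something to print.

```csharp
using System;
using System.Collections.Generic;

public class Animal { public virtual string Name { get { return "Animal"; } } }
public class Mammal : Animal { public override string Name { get { return "Mammal"; } } }

public static class VarianceDemo
{
    public static string Describe(Animal a)
    {
        return "Saw a " + a.Name;
    }

    // Accepts any sequence of Animal; covariance lets callers pass a List<Mammal>.
    public static List<string> Names(IEnumerable<Animal> animals)
    {
        var names = new List<string>();
        foreach (Animal a in animals)
        {
            names.Add(a.Name);
        }
        return names;
    }

    public static void Main()
    {
        // Covariance: IEnumerable<Mammal> is assignable to IEnumerable<Animal>.
        IEnumerable<Mammal> mammals = new List<Mammal> { new Mammal() };
        IEnumerable<Animal> animals = mammals;
        Console.WriteLine(Names(animals)[0]);              // Mammal

        // Contravariance: a delegate that accepts any Animal can stand in
        // for one that is only ever given a Mammal.
        Func<Animal, string> describeAnimal = Describe;
        Func<Mammal, string> describeMammal = describeAnimal;
        Console.WriteLine(describeMammal(new Mammal()));   // Saw a Mammal

        // List<Animal> list = new List<Mammal>();   // compile-time error: generic classes are invariant
    }
}
```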

Generic Methods

Generics can be used to create reusable methods which are more concise than traditional method overloading. Instead of creating several method overloads, which requires a definition for each data type it works with, a single generic method can accomplish the same function. For example, if creating a method that swaps the values stored in two variables, using method overloading you would need to create:

  1. A method that works with ints
  2. A method with the same name that works with floats
  3. A method with the same name that works with strings
  4. A method with the same name that works with bools
  5. A method with the same name that works with ... so on for each required data type

Another approach for creating the reusable method is to generalize all types to the universal base Object type. This was the approach used for creating the non-generic collection classes. This approach is not type safe and can cause poor performance due to boxing and unboxing of value types to and from reference types.

The best approach is to use a generic method. The generic method for performing the swapping logic is as follows:

Generic Swap Method

using System;
using System.Collections.Generic;

namespace GenericSwap
{   
    class Program
    {
        public static void Swap<T>(ref T x, ref T y)
        {
            Console.WriteLine("-->Swapping type is: {0}", typeof(T));
            T temp;
            temp = x;
            x = y;
            y = temp;
        }

        static void Main(string[] args)
        {
            Console.WriteLine("-----  Generic Swap Method  -----\n");

            // Swap Integers
            int i1 = 8;
            int i2 = 77;
            Console.WriteLine("Before Swap i1 = {0}, i2 = {1}", i1, i2);
            Swap<int>(ref i1, ref i2);
            Console.WriteLine("After Swap i1 = {0}, i2 = {1}\n", i1 , i2);

            // Swap Booleans
            bool b1 = true;
            bool b2 = false;
            Console.WriteLine("Before Swap b1 = {0}, b2 = {1}", b1, b2);
            Swap<bool>(ref b1, ref b2);
            Console.WriteLine("After Swap b1 = {0}, b2 = {1}\n", b1, b2);
        }
    }
}
