.NET Metadata Tokens

Have you ever wondered why System.Type derives from System.Reflection.MemberInfo? Why does its inheritance hierarchy look this way?

System.Object
  System.Reflection.MemberInfo
    System.Reflection.EventInfo
    System.Reflection.FieldInfo
    System.Reflection.MethodBase
    System.Reflection.PropertyInfo
    System.Type

We can find the answer by looking at the MSIL representation of members. When we write MSIL by hand and want to reference a member (that is, a type, a field, a property, an event…) we do so by providing its fully qualified name as a string. But this is not how those references are actually stored inside the assembly once we run it through ILAsm (just imagine how inefficient that would be). Under the hood, ILAsm generates a token value for each unique reference we make and adds a corresponding entry to one of the numerous metadata tables.

Tools like Reflector or ILDASM resolve these tokens for the reader's convenience, so you usually never get to see them.

Here’s an example C# code snippet:

static void Main(string[] args)
{
	var x = new DateTime();
	var y = new StringBuilder();
}

When we look at it in ILDASM (or Reflector, for that matter) we get the following output:

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       16 (0x10)
  .maxstack  1
  .locals init ([0] valuetype [mscorlib]System.DateTime x,
           [1] class [mscorlib]System.Text.StringBuilder y)
  IL_0000:  nop
  IL_0001:  ldloca.s   x
  IL_0003:  initobj    [mscorlib]System.DateTime
  IL_0009:  newobj     instance void [mscorlib]System.Text.StringBuilder::.ctor()
  IL_000e:  stloc.1
  IL_000f:  ret
} // end of method Program::Main

But that’s only the default configuration. You can make ILDASM output the token values using the View->Show Token Values option.

.method /*06000001*/ private hidebysig static
        void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       16 (0x10)
  .maxstack  1
  .locals /*11000001*/ init ([0] valuetype [mscorlib/*23000001*/]System.DateTime/*01000013*/ x,
           [1] class [mscorlib/*23000001*/]System.Text.StringBuilder/*01000014*/ y)
  IL_0000:  nop
  IL_0001:  ldloca.s   x
  IL_0003:  initobj    [mscorlib/*23000001*/]System.DateTime/*01000013*/
  IL_0009:  newobj     instance void [mscorlib/*23000001*/]System.Text.StringBuilder/*01000014*/::.ctor() /* 0A000011 */
  IL_000e:  stloc.1
  IL_000f:  ret
} // end of method Program::Main

As we can see, our method is identified by the token 06000001, its local variable signature by 11000001, the assembly mscorlib by 23000001, System.DateTime by 01000013, System.Text.StringBuilder by 01000014 and the StringBuilder constructor by 0A000011.
Each token consists of 4 bytes, where the most significant byte identifies the metadata table in which the entry is stored. In the case of System.DateTime this is 0x01, the TypeRef table. The other three bytes store a RID, a record identifier in that table. The RID is a simple one-based row number (a RID of zero denotes a nil token) and is used like a primary key in a database table. The RID for System.DateTime is 0x000013, or 19 in decimal.
We can confirm that ILDASM did its job of showing us human-friendly names by looking at the metadata tables that carry the reference information (View->MetaInfo->Show, or simply Ctrl+M).

TypeRef #19 (01000013)
-------------------------------------------------------
Token: 0x01000013
ResolutionScope: 0x23000001
TypeRefName: System.DateTime

The Resolution scope is mscorlib, as we can easily infer from the entry at 0x23000001.
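
The token layout described above can be sketched in a few lines of C#. This is only an illustration: the class and method names are made up for this post, and the table lookup covers just the handful of table ids that appear in the listing, not the full set defined in ECMA-335.

```csharp
using System;

// Hypothetical helper (not part of the .NET framework) that splits a
// metadata token into its table id and its RID.
public static class MetadataTokenParts
{
    // The most significant byte selects the metadata table.
    public static int Table(int token) => (int)((uint)token >> 24);

    // The lower three bytes hold the RID (row number within that table).
    public static int Rid(int token) => token & 0x00FFFFFF;

    // Names for the table ids seen in the ILDASM listing only.
    public static string TableName(int token)
    {
        switch (Table(token))
        {
            case 0x01: return "TypeRef";
            case 0x06: return "MethodDef";
            case 0x0A: return "MemberRef";
            case 0x11: return "StandAloneSig";
            case 0x23: return "AssemblyRef";
            default:   return "Table 0x" + Table(token).ToString("X2");
        }
    }

    // Formats a token the way the MetaInfo dump labels its entries.
    public static string Describe(int token) =>
        $"{TableName(token)} #{Rid(token)} ({token:X8})";
}
```

Calling MetadataTokenParts.Describe(0x01000013) produces "TypeRef #19 (01000013)", the same heading ILDASM shows for the System.DateTime entry.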

OK, now that we understand how member references are stored, it is time to return to the inheritance hierarchy of the abstract class System.Reflection.MemberInfo. Since tokens represent member references in a uniform manner, it makes sense to build the reflection APIs around the notion of an abstract MemberInfo that can carry arbitrary, you guessed it, member information. Similar to the first byte of the token, MemberInfo has a property called MemberType that indicates which kind of member this MemberInfo actually is. MemberInfo therefore streamlines working with binary MSIL, such as when directly manipulating the MSIL byte array returned by MethodBase.GetMethodBody().GetILAsByteArray().
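
To make the "binary MSIL" part more concrete, here is a rough sketch of how tokens sit inside a method body. The byte array is assembled by hand from the ILDASM listing shown earlier (using the opcode encodings from ECMA-335), so treat it as an assumption about what the compiler emitted for this particular build rather than as real output. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper that reads the 4-byte token operand following an opcode.
public static class IlTokenReader
{
    // Hand-assembled from the listing above; token operands are little-endian.
    public static readonly byte[] MainIl =
    {
        0x00,                               // IL_0000: nop
        0x12, 0x00,                         // IL_0001: ldloca.s x
        0xFE, 0x15, 0x13, 0x00, 0x00, 0x01, // IL_0003: initobj  01000013
        0x73, 0x11, 0x00, 0x00, 0x0A,       // IL_0009: newobj   0A000011
        0x0B,                               // IL_000e: stloc.1
        0x2A                                // IL_000f: ret
    };

    // Reads a little-endian 32-bit token starting at operandOffset,
    // i.e. at the first byte after the opcode.
    public static int ReadToken(byte[] il, int operandOffset)
    {
        return il[operandOffset]
             | (il[operandOffset + 1] << 8)
             | (il[operandOffset + 2] << 16)
             | (il[operandOffset + 3] << 24);
    }
}
```

initobj is a two-byte opcode, so its operand starts at offset 5, and ReadToken(IlTokenReader.MainIl, 5) gives 0x01000013. newobj is a one-byte opcode at offset 9, so ReadToken(IlTokenReader.MainIl, 10) gives 0x0A000011. A real reader would have to walk the stream opcode by opcode to know where the operands are; this sketch simply uses the offsets from the listing.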

The APIs to resolve a token we encounter in the MSIL stream are provided on the Module class, e.g. Module.ResolveMember(). ResolveMember() is useful for resolving a token regardless of its type, or when you do not know the type of the token in advance.
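
A minimal sketch of resolving a token through the Module class might look like the following. One caveat: the tokens used here come from MemberInfo.MetadataToken, so they are definition tokens (TypeDef, MethodDef) that are only meaningful in the module that defines the member. They are not the TypeRef and MemberRef values from the listing above, which only make sense inside the referencing module. The class and method names are made up for illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical helper that sends a member through its token and back.
public static class TokenRoundTrip
{
    // Takes the member's metadata token and resolves it again in the
    // module that defines the member.
    public static MemberInfo Resolve(MemberInfo member)
    {
        int token = member.MetadataToken;
        return member.Module.ResolveMember(token);
    }

    // MemberType tells us what kind of member came back, much like the
    // first byte of the token tells us which table it points into.
    public static string Describe(MemberInfo member)
    {
        MemberInfo resolved = Resolve(member);
        return $"{resolved.MemberType}: {resolved.Name} (0x{member.MetadataToken:X8})";
    }
}
```

For example, TokenRoundTrip.Resolve(typeof(DateTime)) hands back the Type for System.DateTime, while passing in the parameterless StringBuilder constructor returns a member whose MemberType is Constructor. The caller does not need to know up front which of the two it is going to get.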

More information can be found on MSDN and in the ECMA-335 standard.
