Just recently I had to write code in an area that seems seldom discussed on the Internet. I wanted to access information about a .NET assembly that is normally accessible via Reflection. The catch was that I needed to do this from an “unmanaged” program. That’s right, a genuine native Windows executable!
I knew this was possible. ILDASM is a perfect example of a native Windows application interrogating a .NET assembly. It does this via the Metadata Unmanaged API.
The point of entry for this API is the IMetaDataDispenser / IMetaDataDispenserEx interface. Using the standard technique of CoCreateInstance with the appropriate Class ID (CLSID_CorMetaDataDispenser) will give you a reference to it. Unlike a lot of COM objects, IMetaDataDispenser is not exposed through a type library. As such, you will need to find a declaration for it (and its Class ID). If you’re programming in C++, this declaration is found in cor.h. I’m not.
Turning to the Internet, I found cor.h on the koders web site.
As an aside: Of course, this meant I had to translate the relevant parts of cor.h into Delphi/Pascal code. I did try one free / open source conversion tool. Comically, it translated the braces into begin and end statements and nothing else. I have not written C programs in ten years! Still, I managed a more convincing translation than the software I tried!
The IMetaDataDispenser interface “dispenses” the other interfaces required to interrogate and manipulate the meta-data. Think of it as using an abstract factory pattern. The “easiest” way to access the meta-data is through the IMetaDataImport interface. IMetaDataDispenser provides an OpenScope method, which takes an assembly file-name parameter and can “return” a reference to an IMetaDataImport. Here is some pseudo code showing this:
    if metadataDispenser.OpenScope("assemblyFileName.dll", 0, IID_IMetaDataImport, ppIUnk) == S_OK
        metadataImport = (IMetaDataImport)ppIUnk;
Although meta-data can quite elegantly be described in terms of objects, the interface access to it is distinctly functional. The meta-data provides information about an assembly’s makeup. This includes classes, methods of classes, interfaces, properties, custom-attributes and so on. Each element that is described has a token. When using the IMetaDataImport interface, you refer to an element by its token.
One thing that eluded me for a while is the importance of the token’s value. A token is an unsigned 32 bit value. The most significant byte of the token determines the sort of token that it is. For example, if the MSB’s value is 0x02 then the token describes a “type definition” (or “mdTypeDef”), and when the value is 0x0C the token describes a custom-attribute. The full list of token types is described on the MSDN web-site. The reason this is critical is that some functions return related token values. Furthermore, a single function may return tokens of different types! You need to determine the type of token returned before you attempt to use it in further operations. Pass the wrong sort of token to a function, and the function will fail.
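To make the token layout concrete, here is a minimal C++ sketch of splitting a token into its type byte and the remaining 24 bits (the row identifier). The two constants carry the same values as mdtTypeDef and mdtCustomAttribute in cor.h, and the masks do the same job as its TypeFromToken and RidFromToken macros. The names TokenType, TokenRid and DescribeToken are my own, chosen for illustration, so that the example compiles without the Windows SDK.

```cpp
#include <cstdint>
#include <string>

// Same values as mdtTypeDef and mdtCustomAttribute in cor.h (only two shown).
constexpr std::uint32_t kTypeDef         = 0x02000000u;
constexpr std::uint32_t kCustomAttribute = 0x0C000000u;

// Keeps only the most significant byte, which identifies the token type.
constexpr std::uint32_t TokenType(std::uint32_t token) {
    return token & 0xFF000000u;
}

// Keeps the low 24 bits, which identify the row within that token type's table.
constexpr std::uint32_t TokenRid(std::uint32_t token) {
    return token & 0x00FFFFFFu;
}

// Hypothetical helper: names the token type so a caller can decide
// which functions it is safe to pass the token to.
std::string DescribeToken(std::uint32_t token) {
    switch (TokenType(token)) {
        case kTypeDef:         return "type definition";
        case kCustomAttribute: return "custom attribute";
        default:               return "other";
    }
}
```

Checking the type this way before passing a token on is cheap, and it turns a puzzling failed call into an explicit branch in your own code.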
If you are reading the meta-data, most likely, you will want to run through all the available somethings. To do this, you will need to use the appropriate “Enum” function. The ones that I have used all work in pretty much the same manner: They take a packed array in which to return the appropriate tokens and you also specify how many elements your array has (to prevent over-runs). One of the “out” parameters tells you how many elements the function returned. You can keep calling the function until the out parameters tell you there are no elements left. Time for another pseudo code example. This one enumerates the type definitions in the assembly. With this basic loop, you would be able to perform further interrogation on all the classes defined in the assembly:
    enumerationHandle = 0
    do
        numTokensReturned = 0   // so a failed call ends the loop
        if metaDataInfo.EnumTypeDefs(enumerationHandle, tokens, 5, numTokensReturned) == S_OK
            // Process the tokens in any way you see fit.
    while numTokensReturned > 0;
    metaDataInfo.CloseEnum(enumerationHandle);
The enumerationHandle parameter should be initialised to 0 before the first call to EnumTypeDefs. Once you’re done with the enumerator, pass it to the CloseEnum method to dispose of it.
The tokens parameter is the packed array which, in the above example, would have room for five mdTypeDef tokens.
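The batching behaviour is easier to see in something you can actually run. The C++ sketch below does not touch the real API. FakeMetaData is a hypothetical stand-in that I made up to mimic the shape of EnumTypeDefs and CloseEnum, and kOk / kFalse merely play the roles of S_OK and S_FALSE. In the real interface the handle is an opaque HCORENUM; here it is just a position in a list. Treat it as an illustration of the calling pattern under those assumptions, not as the real interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr long kOk = 0;     // plays the role of S_OK
constexpr long kFalse = 1;  // plays the role of S_FALSE (nothing left)

// Hypothetical stand-in for IMetaDataImport, so the loop runs anywhere.
struct FakeMetaData {
    std::vector<std::uint32_t> typeDefs;  // tokens the pretend assembly holds

    // Fills `out` with up to `capacity` tokens and reports how many it wrote.
    long EnumTypeDefs(std::size_t* handle, std::uint32_t* out,
                      std::size_t capacity, std::size_t* returned) {
        const std::size_t remaining = typeDefs.size() - *handle;
        *returned = std::min(capacity, remaining);
        std::copy_n(typeDefs.begin() + *handle, *returned, out);
        *handle += *returned;
        return *returned > 0 ? kOk : kFalse;
    }

    // The fake has nothing to release; the real CloseEnum frees the enumerator.
    void CloseEnum(std::size_t /*handle*/) {}
};

// Gathers every token by asking for them five at a time.
std::vector<std::uint32_t> CollectTypeDefs(FakeMetaData& meta) {
    std::vector<std::uint32_t> all;
    std::size_t handle = 0;      // must be 0 before the first call
    std::uint32_t batch[5];      // the packed array, room for five tokens
    std::size_t returned = 0;
    while (meta.EnumTypeDefs(&handle, batch, 5, &returned) == kOk && returned > 0) {
        all.insert(all.end(), batch, batch + returned);
    }
    meta.CloseEnum(handle);      // always dispose of the enumerator
    return all;
}
```

With seven tokens in the fake, the loop makes three calls: two that return tokens (five, then two) and a final one that returns none and ends the loop.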
Tokens are all well and good, but they are hardly descriptive to anyone other than your IMetaDataImport reference. Other functions in the IMetaDataImport interface will allow you to fetch the name of the class, or find its ancestor class. But that is a story for another time.
