Sunday 25 October 2009

Dynamic Binding in .NET

What is the root of all evil in COM? Yes, I know this blog is about .NET, but this is relevant, so bear with me.

So what is the root of all evil in COM? The COM veterans will reply instantly: it is late binding. Why is that so? Well, the whole point about COM is interface programming. That is, a server says exactly what it can do, which means that a client can be written to use that functionality, confident that it will be present. Interface programming defines a contract: the server must abide by this contract and the client is reassured that the contract is followed to the letter. The really neat thing about this contract is that the client and server can be in different processes or on different machines and yet the client will still be able to call the server. This form of interface programming is called early binding. The compiler has information about the interface (usually in header files) and refuses to compile the code if the client attempts to call a method not present on the interface, or calls a method with the wrong number or wrong type of parameters.

The compiler. A wonderful piece of software. Not only does it do the translation from your high level language into something the computer can execute, but it also makes sure that what you have written is correct and it will minimise the errors that could occur at runtime. This improves your reputation as a developer, it reduces the time taken to test and debug code, and it makes your code more robust. All these benefits make your project manager and your customer happy. Wonderful!

But does this apply unnecessary restrictions on the developer? Not at all. If a COM developer decides that the next version of the object needs more functionality then COM provides a solution: add a new interface with the new functionality. There is one rule that no COM developer should ever break, and that is once an interface is published it never changes. This immutable aspect of interfaces is vital for COM.
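As a rough illustration of that rule, here is a minimal sketch written in C# rather than COM IDL. The names ICalculator, ICalculator2 and Calculator are hypothetical and exist only for this example; the as operator stands in, loosely, for what a COM client would do with QueryInterface.

```csharp
using System;

// Version 1 of the published contract: never changed after release.
interface ICalculator
{
    int Add(int a, int b);
}

// New functionality goes into a new interface rather than into ICalculator.
interface ICalculator2
{
    int Multiply(int a, int b);
}

// The newer server implements both contracts; an older server would
// implement only ICalculator and existing clients would not notice.
class Calculator : ICalculator, ICalculator2
{
    public int Add(int a, int b) { return a + b; }
    public int Multiply(int a, int b) { return a * b; }
}

static class Client
{
    static void Main()
    {
        ICalculator calc = new Calculator();
        Console.WriteLine(calc.Add(2, 3));

        // Loose analogue of QueryInterface: ask whether the newer
        // contract is supported before using it.
        ICalculator2 calc2 = calc as ICalculator2;
        if (calc2 != null)
            Console.WriteLine(calc2.Multiply(2, 3));
    }
}
```

Old clients that only know ICalculator keep compiling and keep working, while new clients get compile-time checking against ICalculator2.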

However, the problem with interface programming was that some languages (dare I mention a language that has the word Basic in its name?) were not fully functional when it comes to COM. Further, interface programming requires discipline from the developer and some languages naturally attract the less disciplined developers. (Don't tell me that such developers are free-spirited, they are simply sloppy.) So to accommodate such languages and developers Microsoft relaxed the rules and created late binding. With late binding the developer writes code how they think the object should work. Such compilers have had their wings clipped and so only do basic checking (if I was such a compiler I would sulk in the corner at this point). Don't misunderstand me, the code may well work in practice, but no one knows for sure until the code is actually run.

As a professional developer I strive to write good quality code that fits the customer's spec as closely as possible. To do this I have to test and debug the code extensively. That is my job. The compiler helps here because it performs checks on the code for possible issues, including type checks. When you use late binding the compiler performs fewer checks: it essentially ignores the types used through late binding. This means that there is a good possibility that there will be an error in the late binding code. So who experiences such errors? The answer is whoever is running the code. If the developer is thorough then they will test, test, test, so that every possible runtime error is experienced. But can you, as the customer, be sure that the developer is thorough? The reason why I dislike late binding is that it raises the possibility - unacceptably, in my view - of the customer experiencing a runtime error. As a professional developer you should not be using your customers to test your code, but that is what happens when you use late binding.

The language war between C++ and VB3 had a skirmish over late binding. Generally, C++ developers preferred not to touch late binding and VB3 developers had no choice, they had to, so they grew to love it (later versions of VB introduced early binding, but bad habits take a long time to die). C++ developers could write late binding COM code but it was tedious since it involved type library queries and IDispatch calls. OLE Automation is the main reason for late binding and the most prolific user of OLE Automation is Microsoft Office. The widespread use of Office means that there is a demand for C++ developers to write late binding code and so eventually the Visual C++ libraries provided mechanisms to make late binding code easier to write (COleDispatchDriver, #import).

Let's return to .NET. Type safety is very important from a security point of view, and .NET takes security seriously (well, we shall see whether this statement still applies to .NET 4, but that is an interesting post for the future). If the runtime detects that code calls a member on the object not present in its type information, or if the member does exist but the call uses the wrong number of parameters or the wrong parameter types, then the runtime throws an exception. As a developer you have to make sure that the customer never sees such exceptions, and it is better to write your code to be type safe than to catch type mismatch exceptions. The compiler works to help you write type safe code. If you really want to use late binding in .NET you can always use Reflection. With Reflection you can write code that checks the functionality of a class and sets up the managed stack with the appropriate parameters. Calling code this way is tedious and it puts off a lot of developers. Good thing too. Type safety is vitally important, so every effort should be made to dissuade people from writing type unsafe code. But you can see that there is a parallel with C++ and type libraries: it is possible for .NET developers to write late binding code, but since it takes a lot of work, developers were dissuaded from doing something that had the potential of introducing difficult-to-detect runtime exceptions. Until now.
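To give a feel for how tedious the Reflection route is, here is a minimal sketch of reading one property by name. The helper GetProperty and the class ReflectionDemo are names invented for this illustration; the small Item class is defined in the block so that it stands alone.

```csharp
using System;
using System.Reflection;

class Item
{
    int t;
    public int Tail { get { return t; } }
    public Item(int i) { t = i; }
}

static class ReflectionDemo
{
    // Looks up a public instance property by name at runtime and reads it.
    // Returns null when the object's type has no such property.
    public static object GetProperty(object target, string name)
    {
        PropertyInfo prop = target.GetType().GetProperty(
            name, BindingFlags.Public | BindingFlags.Instance);
        if (prop == null)
            return null;
        return prop.GetValue(target, null);
    }

    static void Main()
    {
        object o = new Item(10);
        Console.WriteLine(GetProperty(o, "Tail"));         // prints 10
        Console.WriteLine(GetProperty(o, "Tai1") == null); // prints True
    }
}
```

Note that the property name is just a string, so a misspelling is only discovered when the code runs. That is the price of late binding, and the amount of ceremony involved is what has kept most developers away from it.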

Now for the subject of this post. C# 4 will have a new keyword called dynamic. This brings late binding to C#. I'll repeat that: the dynamic keyword means that our nice type safe language, C#, has been VB3-ified. With this new keyword the developer can write code to call any method in any way they choose and the compiler will perform no type safety checks on those calls. Consequently, all type safety checks are left to the runtime (the CLR, helped by the new Dynamic Language Runtime, or DLR, which sits on top of it), when running on the customer's machine, at which point it is too late for the developer to fix the code. This is exactly the case with late binding (OLE Automation) in VB3, and will soon happen far too frequently with C#.

Do I sound disheartened? Yes, and the reason is that I am a COM consultant as well as a .NET consultant and in the past I was often hired by companies to fix the problems they had with COM code. In almost all cases the reason for the issues was that the developers did not follow interface programming rules. Usually I was asked to fix code that was written in Visual Basic or written for Visual Basic, and I can ascribe a lot of the problems to developers with a late binding mindset writing interface code. The two do not mix. Interface programming is precise, late binding is sloppy. Yet now our beloved C# has been polluted with a keyword that allows C# developers to write sloppy code.

Actually, it is not the keyword that I object to, it is the fact that the keyword is allowed by default. The C# compiler does not allow you to write .NET unsafe code unless you compile your assembly with the /unsafe switch. The same should be true with dynamic. If the developer has to use the /unsafe switch then he gets an immediate indication that the code he is writing is not as type safe as the code that C# usually generates. I hope that Microsoft makes this change to the compiler.

So how does this rather evil keyword work? Let's look at an example.
class Item
{
    int t;
    public int Tail { get { return t; } }
    public Item(int i) { t = i; }
}

Item i = new Item(10);
Console.WriteLine(i.Tail);
object o = new Item(10);
Console.WriteLine(o.Tail); // Compiler error CS1061

In this code we create an object of type Item and initialise it with a value of 10. Then we print out the value of the Tail property. The first pair of lines access the Item object through a typed variable, and so the compiler is happy with the line that accesses the Tail property. In the second pair the Item object is accessed through an object variable. The compiler enforces type safety and only allows the code to access Object members, and since there is no Tail property on the Object class the compiler generates error CS1061. The solution, of course, is to cast the variable to the appropriate type:
Console.WriteLine((o as Item).Tail);

This code has its inherent dangers. The variable o may not be an object of type Item, in which case the as operator returns null and accessing Tail throws a NullReferenceException at runtime. .NET developers know this and the better developers take steps to avoid writing untyped code like this. (This is one reason for generic code: it eliminates many of the reasons for writing untyped code.) Now consider the following code:
Console.WriteLine((o as Item).Tai1);

This will generate error CS1061. Why? Well, the reason is that I typed a number 1 instead of the lower case letter L in the name of the property Tail. The compiler tells me that my sloppy typing has generated an error in the code. The compiler, of course, sees the character 0x31 rather than the character 0x6c, whereas my 46-year-old eyes can easily mistake a 1 for an l. It's a good thing that I have the C# compiler to help me, isn't it?

Now let's see how the dynamic keyword is used. The first point to make is that it should be treated a bit like the object keyword, in that a dynamic variable can be assigned an object of any type:
dynamic d = new Item(10);
Console.WriteLine(d.Tail);

However, the interesting thing is that there is no type checking performed on the variable by the compiler. The code above causes dynamic checking at runtime that the variable d has a property called Tail, and in this case the code will compile and run with no errors. However, in the following code, I have mistyped a 1 for an l again (must get some new glasses...):
Console.WriteLine(d.Tai1);

This compiles fine because there are no compile time checks, but at runtime I get an exception:
Microsoft.CSharp.RuntimeBinder.RuntimeBinderException was unhandled
'Dynamic.Item' does not contain a definition for 'Tai1'

As a user of this code I have to contact the developer and tell him that he needs to get his eyes tested: the user is debugging the code.
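The two calls can be put side by side in one small program. This is only a sketch: the class DynamicDemo and the helper MisspelledAccessThrows are names made up for the illustration, the project is assumed to reference the Microsoft.CSharp assembly, and the try/catch is there purely to make the failure visible, not as a recommendation to catch binder exceptions in real code.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

class Item
{
    int t;
    public int Tail { get { return t; } }
    public Item(int i) { t = i; }
}

static class DynamicDemo
{
    // Returns true when reading the misspelled member fails at runtime.
    public static bool MisspelledAccessThrows(dynamic d)
    {
        try
        {
            // Compiles without complaint: the name is only resolved at runtime.
            Console.WriteLine(d.Tai1);
            return false;
        }
        catch (RuntimeBinderException e)
        {
            Console.WriteLine("Binding failed: " + e.Message);
            return true;
        }
    }

    static void Main()
    {
        dynamic d = new Item(10);
        Console.WriteLine(d.Tail);                    // bound at runtime, succeeds
        Console.WriteLine(MisspelledAccessThrows(d)); // the typo surfaces only now
    }
}
```

Both accesses look identical to the compiler. Only running the program tells them apart.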

So what is happening under the covers? Well, as you can imagine there is some reflection going on. The key to this code is a class called Binder in the Microsoft.CSharp.RuntimeBinder namespace. The compiler generates code to call methods on this class to invoke members of the dynamic object, and the Binder class returns (effectively) a delegate to the member. For example, to call the Tail property the compiler generates the following code:
CSharpArgumentInfo[] info = new CSharpArgumentInfo[] {
    CSharpArgumentInfo.Create(
        CSharpArgumentInfoFlags.None, null) };
CallSite<Func<CallSite, object, object>> pTail =
    CallSite<Func<CallSite, object, object>>.Create(
        Binder.GetMember(
            CSharpBinderFlags.None, "Tail",
            typeof(Program), info));

The Tail property is effectively a method called get_Tail that has no parameters and returns an integer, so the info array with the information about the parameters passed to the method has one entry that has null values (interestingly, you cannot pass null for this parameter, neither can you pass an array with zero members, you have to create one member that has these "null" values).

The Func<> delegate has three type parameters: the first is the CallSite object that defines where the code is being executed (in this case the code is executing in a method of the Program class, see the third parameter of the GetMember method); the second is used to supply the object to be called (Tail is an instance member); and the last is the return type. Huh? The return type is object? It appears so. The CallSite object will only return an object and it is up to you to cast it.
object o = new Item(20);
int ret = (int)pTail.Target.Invoke(pTail, o);

I am not sure why you cannot give a return type other than object (and provide the type as a parameter when creating the CallSite). However, the Binder class does have a method to perform the casting for you:
// Call site that converts the object result to int.
// Note: in the released .NET 4 API, Binder.Convert also takes
// the calling context as a third argument.
CallSite<Func<CallSite, object, int>> pConvert =
    CallSite<Func<CallSite, object, int>>.Create(
        Binder.Convert(
            CSharpBinderFlags.None, typeof(int), typeof(Program)));

// Call site that reads the Tail property.
CSharpArgumentInfo[] info = new CSharpArgumentInfo[] {
    CSharpArgumentInfo.Create(
        CSharpArgumentInfoFlags.None, null) };
CallSite<Func<CallSite, object, object>> pTail =
    CallSite<Func<CallSite, object, object>>.Create(
        Binder.GetMember(
            CSharpBinderFlags.None, "Tail",
            typeof(Program), info));

object o = new Item(20);
int ret = pConvert.Target.Invoke(
    pConvert, pTail.Target.Invoke(pTail, o));
Console.WriteLine(ret);

This code is rather complicated, but in effect it is the code for the following C#:
dynamic o = new Item(20);
Console.WriteLine((int)o.Tail);

The Binder/CallSite code is reminiscent of the sort of code that the MFC Automation wizard or the #import directive creates. The code is generic and cumbersome and should be avoided when possible. No doubt by now you have seen several blogs saying "Wow, ain't this cool?" trying to persuade you that you can relax your C# code into sloppy VB3 code. Resist the urge.

But there must be a reason for dynamic. There is: calling OLE Automation code. In this case the keyword dramatically reduces the amount of code you have to write, and so I am happy for it to be used. I am just wary that the keyword can be used inappropriately. Hence we come full circle. My advice to you as a developer is to only ever use dynamic when you write OLE Automation COM interop code and you should take every step possible to avoid it at all other times. My plea to Microsoft is to change the C# compiler so that it only allows the use of dynamic when the /unsafe (or equivalent) switch is used. If Microsoft does not do this, then I expect in a year's time to be asked to fix other people's code and find that a large proportion comes from developers using the "cool" dynamic keyword inappropriately.
