Showing posts with label metaprogramming. Show all posts

Friday, 17 August 2012

Taking meta-programming beyond crazy

Firstly, a brief note that protobuf-net is up to r580 now, on both nuget and google-code; mainly small tweaks while I build up enough energy to tackle a few larger pieces (some knotty interface / dynamic / base-class improvements are next on the list)

Pushing both ends

Timed to coincide with the .NET 4.5 release (and perhaps more notably, the .NETCore and .NETPortable profiles), I’ve recently spent a lot of time on meta-programming, culminating in the new precompiler that allows these slimmed-down and highly restrictive frameworks to still have fast serialization (static IL, etc).

While there, I set myself a silly stretch goal, mainly just to see if I could do it: to get the whole shebang working on .NET 1.1 too. This would give me a fair claim to supporting the entire .NET framework. So, for a bit of reminiscing – what does that need?

Generics

Generics were introduced in .NET 2. v1 of protobuf-net made massive usage of generics; so much so that it actually killed the runtime on some platforms (see here, here and here). So removing most of the generics was already a primary design goal in v2.

Perhaps the most significant problem I hit here was trying to decide on a core collection type for the internal state. As it turns out, there’s no free lunch here; there is no collection type that is common to all frameworks – some don’t have ArrayList. In the end, I wrote my own simple collection – not just for this, but also because I wanted a collection that was thread-safe for iterations competing with appends (an iterator only reads what existed when it started iterating).
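
As a rough illustration of that last requirement, here is a minimal sketch of an append-only list whose enumerators only see the items that existed when enumeration began. The name `AppendOnlyList` and all of its members are hypothetical; this is not protobuf-net's actual collection, just one way the idea could look without generics or iterator blocks.

```csharp
using System;
using System.Collections;

// Hypothetical sketch; not the collection that protobuf-net actually ships.
public sealed class AppendOnlyList : IEnumerable
{
    private readonly object syncLock = new object();
    private object[] items = new object[4];
    private int count;

    // Appends a value and returns the index it was stored at.
    public int Add(object value)
    {
        lock (syncLock)
        {
            if (count == items.Length)
            {
                // Grow into a fresh array; existing enumerators keep the old one.
                object[] bigger = new object[items.Length * 2];
                Array.Copy(items, bigger, count);
                items = bigger;
            }
            items[count] = value;
            return count++;
        }
    }

    // Captures the current array and length, so later appends are
    // invisible to the enumerator that is returned.
    public IEnumerator GetEnumerator()
    {
        lock (syncLock)
        {
            return new SnapshotEnumerator(items, count);
        }
    }

    // Hand-written enumerator (no "yield return"), reading a fixed prefix.
    private sealed class SnapshotEnumerator : IEnumerator
    {
        private readonly object[] snapshot;
        private readonly int length;
        private int position = -1;

        public SnapshotEnumerator(object[] snapshot, int length)
        {
            this.snapshot = snapshot;
            this.length = length;
        }

        public bool MoveNext()
        {
            if (position < length) position++;
            return position < length;
        }

        public object Current
        {
            get
            {
                if (position < 0 || position >= length)
                    throw new InvalidOperationException("No current item.");
                return snapshot[position];
            }
        }

        public void Reset() { position = -1; }
    }
}
```

Because slots below the captured length are never rewritten, the enumerator can read them without taking the lock; only appends and the snapshot itself are synchronized.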

Language

You’d be amazed what you miss when you try to design something to compile on down-level compilers. For a dare, go into project-properties, the “Build” tab, and click “Advanced…” – and change the “Language Version” to something like ISO-1 (C# 1.2) or ISO-2 (C# 2.0) – see what breaks. Obviously you expect generics to disappear, but you also lose partial methods, partial classes, iterator blocks, lambdas, extension methods, null-coalescing, static classes, etc – and just some technically legal syntax that the early compilers simply struggle with. protobuf-net is configured to build in ISO-2 in the IDE, but with #if-regions to rip out the last few generics in the .NET 1.1 build. Writing iterator blocks… not fun.

Framework

There’s a silly number of variances in the core BCL between different frameworks; even things like string.IsNullOrEmpty or StringBuilder.AppendLine() aren’t all-encompassing. I ended up with a utility class with a decent number of methods to hide the differences (behind yet more #if-regions). But by far the craziest problem: reflection. And protobuf-net, at least in “Full” mode (see here for an overview of “CoreOnly” vs “Full”), uses plenty of reflection. Oddly enough, the reflection in .NET 1.1 isn’t bad – sure, it would be nice to have DynamicMethod, but I can live without it. Getting this working on .NET 1.1 was painless compared to .NETCore.
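
A utility class of that kind might look something like the sketch below. The class name `Helpers` and the `NET11` compilation symbol are assumptions made for illustration; the post does not show the real helper or its symbols.

```csharp
using System;
using System.Text;

// Hypothetical helper; NET11 is an assumed conditional-compilation symbol.
public sealed class Helpers
{
    // Private constructor stands in for "static class", which C# 1.x lacks.
    private Helpers() { }

    public static bool IsNullOrEmpty(string value)
    {
#if NET11
        // string.IsNullOrEmpty is not available on this framework.
        return value == null || value.Length == 0;
#else
        return string.IsNullOrEmpty(value);
#endif
    }

    public static StringBuilder AppendLine(StringBuilder builder)
    {
#if NET11
        // StringBuilder.AppendLine() is not available on this framework.
        return builder.Append(Environment.NewLine);
#else
        return builder.AppendLine();
#endif
    }
}
```

Calling code then uses `Helpers.IsNullOrEmpty(s)` everywhere, and only this one file needs to know which framework it is being built for.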

Aside / rant: how much do I hate “.GetTypeInfo()” on .NETCore? With the fiery rage of 2 stars slowly crashing into each-other. Oh, I’m sure that the differences to Type / TypeInfo make perfect sense for application-developers in .NETCore, who probably should be limiting their use of reflection, but for library authors: this change really, really hurts. The one thing that lets me stay civil about this change is that in “CoreOnly” + “precompiler” we do all the reflection work up-front using the regular reflection API, so for me at least most of this ugly is just a cruel artefact. But still: grrrrrrrrrrrrrr.

Opcodes

There are a number of opcodes that simply don’t exist back on 1.1; if I’ve done my compare correctly, these are: Unbox_Any, Readonly, Constrained, Ldelem and Stelem. The good news is that most of these exist only to support generics, and are pretty easy to substitute if you know that you aren’t dealing with generics.
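
To give a flavour of such a substitution: when the type is known up front, `unbox.any` can be replaced with `unbox` followed by `ldobj` for a value type, or with `castclass` for a reference type. The helper below is a sketch of that idea, not protobuf-net's own code; the names are hypothetical, and the small test harness uses `DynamicMethod` and a generic delegate purely so it can be run on a current framework (neither exists on .NET 1.1).

```csharp
using System;
using System.Reflection.Emit;

public static class OpcodeSubstitution
{
    // Emits the equivalent of "unbox.any <type>" using only older opcodes.
    public static void EmitUnboxAny(ILGenerator il, Type type)
    {
        if (type.IsValueType)
        {
            il.Emit(OpCodes.Unbox, type);  // pointer to the boxed value
            il.Emit(OpCodes.Ldobj, type);  // copy the value onto the stack
        }
        else
        {
            il.Emit(OpCodes.Castclass, type);
        }
    }

    // Demo harness: builds "int UnboxInt(object value)".
    public static Func<object, int> BuildIntUnboxer()
    {
        DynamicMethod method = new DynamicMethod(
            "UnboxInt", typeof(int), new Type[] { typeof(object) });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        EmitUnboxAny(il, typeof(int));
        il.Emit(OpCodes.Ret);
        return (Func<object, int>)method.CreateDelegate(typeof(Func<object, int>));
    }
}
```

`Ldelem` and `Stelem` can be handled in the same spirit, by picking the type-specific form (for example `Ldelem_I4` or `Ldelem_Ref`) once the element type is known.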

Metadata Version

.NET 1.1 uses an earlier version of the metadata packaging format than all the others use. This is yet another thing that the inbuilt Reflection.Emit can’t help, but - my new favorite metaprogramming tool to the rescue: IKVM.Reflection supports this. I have to offer yet another thanks to Jeroen Frijters who showed me the correct incantations to make things happy: beyond the basics of IKVM, the key part here is a voodoo call to IKVM’s implementation of AssemblyBuilder:

asm.__SetImageRuntimeVersion("v1.1.4322", 0x10000);

The 0x10000 here is a magic number that specifies the .NET 1.1 metadata format. For reference, 0x20000 is the version you want the rest of the time. As always, IKVM.Reflection seems to have considered everything; it is the gold standard of assembly writing tools. Awesome job, Jeroen. I jokingly half-expect to find that Roslyn has a reference to IKVM.Reflection ;p

Putting the pieces together

But! Once you’ve dealt with all those trivial problems; it works. I’m happy to say that protobuf-net now has “CoreOnly” and “Full” builds, and support from “precompiler”. So if you still have .NET 1.1 applications (and I promise not to judge you… much), you can now use protobuf-net with as many optimizations as it is capable of. Which is cute:

C:\SomePath>AnotherPath\precompile.exe Net11_Poco.dll -o:MyCrazy.dll -t:MySerializer

protobuf-net pre-compiler
Detected framework: C:\Windows\Microsoft.NET\Framework\v1.1.4322
Resolved C:\Windows\Microsoft.NET\Framework\v1.1.4322\mscorlib.dll
Resolved C:\Windows\Microsoft.NET\Framework\v1.1.4322\System.dll
Resolved protobuf-net.dll
Adding DAL.DatabaseCompat...
Resolved C:\Windows\Microsoft.NET\Framework\v1.1.4322\System.Xml.dll
Adding DAL.DatabaseCompatRem...
Adding DAL.OrderCompat...
Adding DAL.OrderLineCompat...
Adding DAL.VariousFieldTypes...
Compiling MySerializer to MyCrazy.dll...
All done

C:\SomePath>peverify MyCrazy.dll

Microsoft (R) .NET Framework PE Verifier Version 1.1.4322.573
Copyright (C) Microsoft Corporation 1998-2002. All rights reserved.

All Classes and Methods in MyCrazy.dll Verified

In case it isn’t obvious, “Net11_Poco.dll” is a .NET 1.1 dll created in Visual Studio 2003; “precompiler” has then detected the 1.1-ness, bound IKVM to the .NET 1.1 framework, and compiled a protobuf-net custom serializer for that model, as a legal .NET 1.1 dll.

Questionable sanity

Another way of reading all this is: I’ve possibly now crossed the line between “eccentric” and “batshit crazy”. I don’t have a need to use .NET 1.1, but I would be overjoyed if someone else gets some genuine usage out of this. Mainly, I just wanted to learn some things, challenge myself, and take a bit of professional pride in doing something fully and properly – just because: I can.

Monday, 16 July 2012

Introducing the protobuf-net precompiler

Over the last few posts (here and here) I’ve given a few hints as to compiling for other platforms. Basically, this all relates to how well protobuf-net works on platforms like iOS, WinRT/Metro, Silverlight, Phone 7, etc. These heavily restricted runtimes don’t allow much meta-programming, and they might be running on low-power CPUs, so reflection (even if possible) is not ideal.

I’ve played with assembly generation before, but with mixed results. For example, here for Phone 7. This could just about work for some frameworks, but was patchy on others, and won’t work at all for the rest.

Well, all that IKVM shininess has opened up a whole new set of tools. The small beauty that I’m ridiculously pleased with is a new utility exe in the SVN trunk (will be part of a proper release soon): precompile.

This unassuming little tool works a bit like “sgen”, but with the ability to target multiple frameworks. What it does is:

  • inspect the input assembly (or assemblies) to resolve (if it can) the target framework
  • initialize an IKVM universe targeting that framework
  • load the core framework libraries, protobuf-net, and the input assemblies into the IKVM universe
  • scan the input assemblies for types marked [ProtoContract]
  • add those to a protobuf-net model
  • compile the model to a type / assembly of your choosing
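
The scanning step can be sketched with plain System.Reflection, as below. This is an assumption-laden illustration rather than the precompiler's code: the real tool works against IKVM.Reflection types and looks for [ProtoContract], whereas here `DemoContractAttribute`, `Order`, `NotSerialized` and `ContractScanner` are all hypothetical stand-ins. The attribute is matched by name via `CustomAttributeData`, which inspects the metadata without instantiating the attribute.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in for [ProtoContract].
[AttributeUsage(AttributeTargets.Class)]
public sealed class DemoContractAttribute : Attribute { }

[DemoContract]
public class Order { }

public class NotSerialized { }

public static class ContractScanner
{
    // Returns every type in the assembly that carries the named attribute.
    public static List<Type> FindContracts(Assembly assembly, string attributeFullName)
    {
        List<Type> found = new List<Type>();
        foreach (Type type in assembly.GetTypes())
        {
            foreach (CustomAttributeData data in CustomAttributeData.GetCustomAttributes(type))
            {
                // Compare by name: the attribute's Type object may come from
                // a different load context than the scanning code.
                if (data.Constructor.DeclaringType.FullName == attributeFullName)
                {
                    found.Add(type);
                    break;
                }
            }
        }
        return found;
    }
}
```

Each type found this way would then be added to the model before compiling it.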

To use that in a project you might:

  • create a new DTO project in your chosen framework, and compile it
  • execute precompile to generate a serializer assembly
  • from your application project, reference the DTO and serializer assemblies
  • use the type you created

For example, say I create a new Phone 7 DTO assembly, called PhoneDto (because it is late and I lack imagination). I can then create a serialization assembly via:

precompile {some path}\PhoneDto.dll -o:PhoneSerializer.dll -t:MySerializer

This will generate a library called PhoneSerializer.dll, which you can reference from your main project (in addition to the DTO and the protobuf-net core).

Then, just use MySerializer:

var ser = new MySerializer();
ser.Serialize(output, obj);

I hope this finally solves a number of tooling issues. I’m pretty pleased with it. I’ve tested it against a range of different frameworks, and it has worked well – but if you get problems, just let me know (comment here, or email me, or log an issue on the protobuf-net project site).

Saturday, 14 July 2012

Enter the IKVM

aka Meta-programming and Metro for .NET

In my previous blog entry I gave an overview of how protobuf-net is arranged internally, and hinted that it all falls apart for Metro. So: what goes wrong? Currently, protobuf-net has a lot (and I do mean a lot) of code built on top of ILGenerator, the reflection class that underpins meta-programming. This class is great for hardcore library builders: you can build individual methods (via DynamicMethod) or entire assemblies (via AssemblyBuilder). It is very low level, since you are writing raw IL – but I’m not completely crazy, so I wrap that up in some utility methods. There are a few problems, though:

  • ILGenerator usually expects to be working on the current framework
  • Even though you can use a “reflection only load”, this still fails horribly for System.Runtime.dll (aka Metro for .NET)
  • In early versions of .NET (and I try to support .NET 2 and upwards), the ability to properly inspect attribute data against reflection-only types/members (GetCustomAttributesData) is simply missing

So… basically, it all falls to pieces for generating an assembly targeting Metro for .NET, based on a Metro for .NET input assembly.

What I would need is some kind of utility that is:

  • Broadly similar to Reflection.Emit, so I don’t have to rewrite a few thousand lines of code (since I want to keep that code for the runtime meta-programming on full .NET)
  • Able to load and work with alternative frameworks without getting confused
  • Able to give full attribute information, even on .NET 2
  • Inbuilt or free, to match the existing license

Who would go to the trouble of writing such a thing?

Maybe somebody who is already writing compiler-like tools that aren’t tied to a specific framework, and who has run into the existing limitations of Reflection.Emit? Maybe somebody writing a Java/.NET bridge? Like IKVM.NET ?

Actually, for my purposes I don’t need most of IKVM. I just need one tiny piece of it: IKVM.Reflection. This library was specifically written to mimic the existing Reflection.Emit API, but without all the problems. In particular, it has a Universe class that is a bit like an AppDomain; you load assemblies into a Universe, and that is what is used for resolution etc after that. Here’s a simple example – which will immediately make sense to anyone used to Reflection.Emit – or another example specifically targeting Metro for .NET. The nice thing about the API is that mostly I can change between IKVM and Reflection.Emit via a simple:

#if FEAT_IKVM
using Type = IKVM.Reflection.Type;
using IKVM.Reflection;
#else
using System.Reflection;
#endif

OK, I would be grossly exaggerating if I claimed that was the only change I had to make, but that’s the most demonstrable bit. What this means is that just for my cross-platform compiler, everything in the Model, Strategy and Compiler modules (see previous blog) switches to IKVM terminology. The Core still uses System terminology, and the Runtime doesn’t apply (I obviously can’t instantiate/execute types/methods from another framework, even if I can inspect them).

With this technique, I can now successfully load a Metro for .NET assembly (such as a DTO), and generate a fully static-compiled assembly (“Standalone” on the diagram) targeting Metro for .NET. No reflection at runtime, no nasty hacks – just: an assembly that wasn’t generated by the MS tools.

I’m still working on turning this into a slicker tool (comparable to, say, SGEN), but a working illustration is in the repo; in particular, see MetroDTO (some sample DTOs), TestIkvm (my proof-of-concept compiler hard-coded to MetroDTO), and Metro_DevRig (which shows the generated assembly working in a Metro for .NET application, including performance comparisons to XmlSerializer and DataContractSerializer).

Additionally, it seems extremely likely that the same tool should be able to write platform-targeted assemblies for Silverlight, Phone 7, XNA, CF, etc. Which is nice.

Thanks and Acknowledgements

I’m hugely indebted to Jeroen Frijters in all of this; both for providing IKVM.Reflection, and for his direct and very prompt assistance when playing with all the above. Mainly for correcting my own brain-dead mistakes, but also for a very quick bug-fix when needed. The library is awesome, thanks.

(as a small caveat, note that protobuf-net is currently exposing a locally built version of IKVM.Reflection; this will be rectified after the next dev build of IKVM)

Thursday, 12 January 2012

Playing with your member

(and: introducing FastMember)

Toying with members. We all do it. Some do it slow, some do it fast.

I am of course talking about the type of flexible member access that you need regularly in data-binding, materialization, and serialization code – and various other utility code.

Background

Here’s standard member access:

Foo obj = GetStaticTypedFoo();
obj.Bar = "abc";

Not very exciting, is it? Traditional static-typed C# is very efficient here when everything is known at compile-time. With C# 4.0, we also get nice support for when the target is not known at compile time:

dynamic obj = GetDynamicFoo();
obj.Bar = "abc";

Looks much the same, eh? But what about when the member is not known? What we can’t do is:

dynamic obj = GetStaticTypedFoo();
string propName = "Bar";
obj.propName = "abc"; // does not do what we intended!

So, we find ourselves in the realm of reflection. And as everyone knows, reflection is slooooooooow. Or at least, it is normally; if you don’t object to talking with Cthulhu you can get into the exciting realms of meta-programming with tools like Expression or ILGenerator – but most people like keeping hold of their sanity, so… what to do?
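
For reference, the plain-reflection version of "set the member whose name is only known at runtime" looks roughly like this. `Foo` and `Bar` mirror the snippets above; the `ReflectionAccess` helper is a hypothetical name used only for this sketch.

```csharp
using System;
using System.Reflection;

public class Foo
{
    public string Bar { get; set; }
}

public static class ReflectionAccess
{
    // Looks the property up by name on every call; simple, but slow.
    public static void SetByName(object target, string propName, object value)
    {
        PropertyInfo prop = target.GetType().GetProperty(propName);
        if (prop == null)
        {
            throw new ArgumentException("No public property named " + propName, "propName");
        }
        prop.SetValue(target, value, null);
    }
}
```

So `ReflectionAccess.SetByName(obj, propName, "abc")` does what `obj.propName = "abc"` could not, at the cost of a lookup and a late-bound invoke each time.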

Middle-ground

A few years ago, I threw together HyperDescriptor; this is a custom implementation of the System.ComponentModel representation of properties, but using some IL instead of reflection – significantly faster. It is a good tool – a worthy tool; but… I just can’t get excited about it now, for various reasons, but perhaps most importantly:

  • the weirdness that is System.ComponentModel is slowly fading away into obscurity
  • it does not really address the DLR

Additionally, I’ve seen a few bug reports since 4.0, and frankly I’m not sure it is quite the right tool now. Fixing it is sometimes a bad thing.

Having written tools like dapper-dot-net and protobuf-net, my joy of meta-programming has grown. Time to start afresh!

FastMember

So with gleaming eyes and a bottle of Chilean to keep the evil out, I whacked together a fresh library; FastMember – available on google-code and nuget. It isn’t very big, or very complex – it simply aims to solve two scenarios:

  • reading and writing properties and fields (known by name at runtime) on a set of homogeneous (i.e. groups of the same type) objects
  • reading and writing properties and fields (known by name at runtime) on an individual object, which might be a DLR object

Here’s some typical usage (EDITED - API changes):

var accessor = TypeAccessor.Create(type);
string propName = // something known only at runtime
while( /* some loop of data */ ) {
    accessor[obj, propName] = rowValue;
}

or:

// could be static or DLR
var wrapped = ObjectAccessor.Create(obj);
string propName = // something known only at runtime
Console.WriteLine(wrapped[propName]);

Nothing hugely exciting, but it comes up often enough (especially with the DLR aspect) to be worth putting somewhere reusable. It might also serve as a small but complete example for either meta-programming (ILGenerator etc), or manual DLR programming (CallSite etc).

Mary Mary quite contrary, how does your member perform?

So let’s roll some numbers; I’m bundling read and write together here for brevity, but - based on 1M reads and 1M writes of a class with an auto-implemented string property:

Static C#: 14ms
Dynamic C#: 268ms
PropertyInfo: 8879ms
PropertyDescriptor: 12847ms
TypeAccessor.Create: 73ms
ObjectAccessor.Create: 92ms

As you can see, it somewhat stomps on both reflection (PropertyInfo) and System.ComponentModel (PropertyDescriptor), and isn't very far from static-typed C#. Furthermore, both APIs work (as mentioned) with DLR types, which is cute - because frankly they are a pain to talk to manually. It also supports fields (vs. properties) and structs (vs. classes, although only for read operations).
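
The general reason generated code gets so close to static C# is that the member lookup happens once, and every later call goes through a compiled delegate. The sketch below shows that idea using Expression trees; this is a swapped-in technique chosen for brevity, not FastMember's implementation, and `CompiledAccess` and `Foo` are hypothetical names. It only handles properties on classes.

```csharp
using System;
using System.Linq.Expressions;

public class Foo
{
    public string Bar { get; set; }
}

public static class CompiledAccess
{
    // Builds (target, value) => ((T)target).Prop = (TProp)value, compiled once.
    public static Action<object, object> BuildSetter(Type type, string propName)
    {
        ParameterExpression target = Expression.Parameter(typeof(object), "target");
        ParameterExpression value = Expression.Parameter(typeof(object), "value");
        MemberExpression member =
            Expression.Property(Expression.Convert(target, type), propName);
        BinaryExpression assign =
            Expression.Assign(member, Expression.Convert(value, member.Type));
        return Expression.Lambda<Action<object, object>>(assign, target, value).Compile();
    }

    // Builds target => (object)((T)target).Prop, compiled once.
    public static Func<object, object> BuildGetter(Type type, string propName)
    {
        ParameterExpression target = Expression.Parameter(typeof(object), "target");
        MemberExpression member =
            Expression.Property(Expression.Convert(target, type), propName);
        UnaryExpression boxed = Expression.Convert(member, typeof(object));
        return Expression.Lambda<Func<object, object>>(boxed, target).Compile();
    }
}
```

The delegates would normally be cached per type and member name, so the cost of building and compiling the expression is paid once rather than per access.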

That's all; I had some fun writing it; I hope some folks get some use out of it.

Monday, 15 March 2010

When is an int[] not an int[]?

I’ve spent my entire train journey trying to get to the bottom of this, so I thought I'd blog it for posterity. In my crazed Reflection.Emit frenzy, my unit tests were erroring with PEVerify complaining about illegal ldlen codes:

[offset 0x....] Expected single-dimension zero-based array.

If you're doing meta-programming, tools like PEVerify and Reflector are your closest allies, but this took some head-scratching. I even distilled the code down to two seemingly identical bits of code that read and discard the length of an array variable initialized to null:

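[Two screenshots, side by side: the local variable declarations, and the IL that reads and discards each array length]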

The first pane declares “loc 0” and “loc 2” as local int[] variables; forget about “loc 1” – it is unrelated. The second pane initializes each array variable as a null reference, obtains the length (which is a “native int” which I immediately convert to Int32), and then discards the value.

So why the error? And why one error and not two? PEVerify is, after all, a chatty beast… Either I’ve gone crazy in my code, or somebody is lying to me! Actually, both, it turns out.

Pop quiz: what is the difference between these two Type instances representing a 1-dimension array of int:

Type explicitRank = typeof(int).MakeArrayType(1),
implicitRank = typeof(int).MakeArrayType();

The second is our friend, int[]. The first is something different, though; it is a 1-dimensional array of int sure enough, but it isn’t explicitly zero-based! (correction due: see comments) D’oh! It goes by the moniker int[*].

Simply; you can’t use ldlen on an int[*] – only an int[]. What I don’t yet understand is why the upstream code (when it assigned the array “for real”) didn’t complain about the very attempt to assign an int[] value (from a standard “get” accessor) to an int[*] local variable. Presumably the PEVerify authors didn’t think anyone would be stupid enough to try ;-p
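
The distinction is easy to check at runtime. The sketch below compares the two Type instances and adds a small guard that could be run before declaring a local; `ArrayTypeDemo` and `IsOrdinaryVector` are hypothetical names for illustration.

```csharp
using System;

public static class ArrayTypeDemo
{
    // Rank given explicitly: the type the runtime reports as int[*].
    public static readonly Type ExplicitRank = typeof(int).MakeArrayType(1);

    // No rank given: the ordinary int[] that ldlen is happy with.
    public static readonly Type ImplicitRank = typeof(int).MakeArrayType();

    // True only for ordinary single-dimension arrays such as int[];
    // false for int[*] and for multi-dimensional arrays such as int[,].
    public static bool IsOrdinaryVector(Type arrayType)
    {
        return arrayType.IsArray
            && arrayType == arrayType.GetElementType().MakeArrayType();
    }
}
```

Both types report a rank of 1, which is what makes the mix-up so easy to miss; only `ImplicitRank` compares equal to `typeof(int[])`.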

The moral here; sometimes it pays to be less explicit (and I don’t just mean the language I used when I found the problem). I’ve also left feedback with Red Gate to tweak how it displays, but to be honest the number of people this cosmetic glitch will affect is minimal.