This is my personal blog. The views expressed on these pages are mine alone and not those of my employer.

Tuesday, 25 November 2014

MSTest Breaking Changes and the Legacy Lifesaver

Today I encountered an issue with a group of unit tests that were failing on the build server but running fine on my development machine.

The common factor between these tests was that they all made use of an attribute extending ExpectedExceptionBaseAttribute, defined in the UnitTesting namespace. This extension compared the exception message, rather than just the exception type as the vanilla attributes do.

My development machine was running VS 2013; the build server, VS 2012. After a bit of digging it was clear I was looking at a breaking change in the VS 2012 test runner.

So what could I do about it?
  • Upgrade the build server? Not viable
  • Ignore the tests? Plain stupid
  • Toggle exception attributes with preprocessor directives? Hacky

There was a fourth option: MSTest legacy mode.

To enable it I first had to create a solution-level file with a .runsettings extension, into which I placed the following contents:
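Along the lines of the following minimal sketch; the ForcedLegacyMode element under MSTest is the part that does the work:

<?xml version="1.0" encoding="utf-8"?>
<RunSettings>
  <!-- Force the runner back to the legacy (pre-VS 2012) MSTest behaviour -->
  <MSTest>
    <ForcedLegacyMode>true</ForcedLegacyMode>
  </MSTest>
</RunSettings>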


This was then checked in and referenced in the build definition under Automated Tests -> Test Source -> Run Settings:


From this point onwards all tests were running as green as ever.

Perhaps this helped you?  Have you experienced any MSTest breaking changes?


Saturday, 8 November 2014

Solving the Software Estimation Enigma



Software developers are pants at estimation
M. Edmondson

There, I said it.  You and I both know it. It's one of the hardest tasks we face as software developers. But guess what: there is a way that's worked for me, and it's backed up by the likes of 'Uncle' Bob and others.

First and foremost it's important to understand that an estimate is not a single number. An estimate is a distribution. This means your estimate will be a range within which the task will complete. Think 6 hours, plus or minus 2.

The reason for this is that you'll always have to make assumptions at some level.  A range takes into account unknowns such as a web service being down, or finding you need to modify a method which uses a wretched spaghetti of reference values or [add your developer woe here].

So how do you create such an estimate? First break down the problem into the smallest feasible chunks of work.  Try to get as granular as you can, but don't worry too much if you're struggling.

Then decide on the following three timings for the task:
  • b = best case
  • w = worst case
  • m = most likely case (i.e. what your estimate would've been before you read this)
b and w are complete edge cases, once-in-a-lifetime events.  For the maths to work these must each have less than a 1% chance of actually occurring.

Then simply plug them into the following equation to get the first part:

e = (b + 4m + w) / 6
then:

v = (w - b) / 6
These two magical numbers let you know:
  • e, how long the task will take
  • v, the variation around that number
You therefore tell the project manager / business the task will take:
"e plus or minus v"
This has rewarded me with surprising accuracy, whilst also providing the benefit of communicating your certainty (v).

Do you agree?  How do you solve your estimation enigma?

Monday, 15 September 2014

Pretentious Parameters and the C# Compiler

I was reading through Jon Skeet's brilliant C# In Depth when I came across a thought-provoking shred of info tucked away in Part 4.

Jon was describing optional parameters, how they've been supported in the CLR from .NET 1.0 and the motivation behind their inclusion in C# 4. Along with this he provided an example which exhibits behaviour you probably won't expect.

First create a class library in which you define an optional parameter:
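Something along these lines, with the default of 20 chosen to match the discussion below (the names are illustrative):

// LibraryDemo project
public class LibraryDemo
{
    public static void PrintValue(int value = 20)
    {
        Console.WriteLine(value);
    }
}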

then reference it from a separate project:
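A sketch of the calling project (again, names are mine):

// Program project, which references LibraryDemo
class Program
{
    static void Main()
    {
        LibraryDemo.PrintValue();   // no argument supplied, so the default applies
    }
}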


Fire up your favourite IL decompiler and take a look at Program
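Roughly speaking, and using the names from the sketch above, the decompiled Program comes back looking like this:

class Program
{
    static void Main()
    {
        // The default has been copied into the call site as a literal
        LibraryDemo.PrintValue(20);
    }
}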


Program has taken the value 20 directly from LibraryDemo and assimilated it as its own. What the hell?

I'm sure you're already picturing the issues this can lead to.  If you were to modify the default in LibraryDemo and neglect to recompile Program, you would still get 20, which, believe me, will lead to bugs that are damn hard to track down.

This does at least explain why the requirements for optional parameters are as follows:
10.6.1
The expression in a default-argument must be one of the following:
  • a constant-expression 
  • a new expression of the form new S() where S is a value type
  • an expression of the form default(S) where S is a value type
All of these expressions are known at compile time allowing the compiler to stuff whatever you specify into other classes / projects / trash can.

However, the reality is we don't care what the compiler is doing, since this is an example of an implementation detail, which I've already taught you not to rely on. Therefore I'm quite happy to file this as a simple curiosity, which may well help save a colleague's sanity in the near future.

...and anyway, when do you ever build a single project without building the whole solution?

Wednesday, 27 August 2014

Why LINQ requires you to Func?

When LINQ appeared on our screens it brought along a requirement under the guise of 'Func' whenever you wanted to do anything substantial, such as supply the contents of a where clause:
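For example, the signature of Enumerable.Where asks for a Func<TSource, bool> to decide which items to keep:

public static IEnumerable<TSource> Where<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, bool> predicate);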


What exactly is Func? What are we actually being asked for here?  This is a journey that begins with delegates.

First let's consider how we create an object:
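A sketch along those lines (the Car class here is purely an illustration):

// Define a class
public class Car
{
    public void Drive() { Console.WriteLine("Driving"); }
}

// Create a variable of that type and point it at a new instance
Car car = new Car();
car.Drive();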


As you can see this takes three steps:

  • We define a class (Car)
  • We create a variable of that type
  • Then we create an instance of the class and set the variable to a reference of it

A bit simple? Stick with me...

Now let's see how we do this with delegates.  I invite you to spot the difference:
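Here's the same shape of code, but with Car defined as a delegate rather than a class (again, the names are illustrative):

// Define a delegate type - a description of a method's signature
public delegate void Car();

// A method whose signature matches the delegate
static void Drive() { Console.WriteLine("Driving"); }

// Create a variable of that type and point it at a matching method
Car car = Drive;
car();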



In both instances Car acted as a pointer to some functionality: the first was a reference to an instance of a class, whereas the second was a reference to a method.  You've spotted the difference.

Class type = Reference to an object
Delegate type = Reference to a method

Directly from the C# Spec:
1.11 Delegates

A delegate type represents references to methods with a particular parameter list and return type. Delegates make it possible to treat methods as entities that can be assigned to variables and passed as parameters.

That last sentence is extremely important.  Delegates allow us to pass references to methods in the same way we can pass references to objects.

This is the purpose of Func.  Instead of being forced to define our own delegate (as we did in the second example, named Car), we're provided with a ready-made delegate definition for which we have to supply a suitable method reference.

Let me hammer this home:
Usually we call methods with a reference to (or the value of) an object; instead, the Where method is requesting a reference to a method whose type matches Func.
Let's see a concrete example:

Scenario: Select all names beginning with M

To do this I need to create a method that tests a given item and returns true if it begins with M:
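A sketch of such a method, and of passing a reference to it (the names are mine; the shape is what matters):

// Takes a string and returns a bool - exactly the shape of Func<string, bool>
static bool BeginsWithM(string name)
{
    return name.StartsWith("M");
}

// Pass a reference to that method wherever a Func<string, bool> is expected
var names = new List<string> { "Mark", "Sarah", "Michael" };
IEnumerable<string> mNames = names.Where(BeginsWithM);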



However, because of these features (introduced in C# 2*):

- Removing the awkward delegate syntax
- Anonymous methods, allowing you to define a delegate instance's action in-line

we can supply it in a way I suspect you're familiar with:
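For instance, continuing the sketch above; the first line uses the C# 2 anonymous method syntax, the second the C# 3 lambda form most of us reach for today, and both compile down to a Func<string, bool>:

// Anonymous method (C# 2) - the test supplied in-line, no separately declared method
IEnumerable<string> mNames = names.Where(delegate(string name) { return name.StartsWith("M"); });

// The lambda form (C# 3) does exactly the same job
IEnumerable<string> mNames2 = names.Where(name => name.StartsWith("M"));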


* Surprising, isn't it? This has been available since 2005!

Hopefully you can see that the only difference is that we've compressed the method definition, moving towards a more fluid, expressive form of programming, which, in conjunction with anonymous methods, is the only purpose of Func.

There are 17 varieties of Func, each specifying a different number of parameters so you can pick the right one for your task.  There is also a sibling, Action, which has the same purpose except that it doesn't return a value.

So...why not have a glance over your codebase? How many times have you used Func without giving it a second thought?  Even better, how could you produce your own methods that accept functionality as a parameter in the form of Func?

Thursday, 17 July 2014

Unassuming Unicode, the secret to characters on the web

Recently I got an e-mail with an interesting title:

How did they do that?

Just how did KLM insert an airplane into the subject of an e-mail? Unicode!

I needn't put a full description here, but Unicode is the system that provides a unique identifier for every single character your computer is capable of displaying.  Yes: Chinese, Yiddish, Maldivian, airplane symbols, the lot!

So what does this look like under the hood?

To find out I copied the character into Notepad and saved it, ensuring I selected 'Unicode' as the encoding at the bottom of the 'Save As' dialog.


Then I viewed the raw binary of the file in a hex editor (I just happened to pick this online one).  The results were simply:

FF FE 08 27

What we're seeing here is the hexadecimal representation of the binary in the file.  You can confirm this using Windows Calculator in programmer mode, but for simplicity this is:

FF     11111111
FE     11111110
08     00001000
27     00100111

The first two bytes are the byte order mark (BOM), telling us this is little-endian UTF-16.  Endianness simply tells us from which end we read the data: little-endian means the least significant byte comes first, so this pair of bytes is read in reverse.

So doing this we now have (omitting the byte order mark):

27 08
Which just so happens to be the unique identifier for the airplane symbol:
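You can confirm this from C# using the standard Encoding APIs (Encoding lives in System.Text); a quick sketch reproduces the bytes above:

string plane = char.ConvertFromUtf32(0x2708);      // the airplane character
byte[] bom   = Encoding.Unicode.GetPreamble();     // FF FE - the little-endian UTF-16 BOM
byte[] bytes = Encoding.Unicode.GetBytes(plane);   // 08 27 - least significant byte first
Console.WriteLine(BitConverter.ToString(bom) + "-" + BitConverter.ToString(bytes)); // FF-FE-08-27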



But why do you care about this?  You could've just copied and pasted the original symbol, right?

Well, it just so happens that HTML character references use these Unicode code points directly.  So if I wanted to use this character myself I'd want to be absolutely certain it'll render correctly.

To do this I'd first make sure my page is declared as being encoded in UTF-8 using the correct meta tag:
<meta charset="utf-8">
Then I can create the character using &#xnnnn; where nnnn is the hexadecimal Unicode code point. Therefore &#x2708; creates our airplane:


That's just one.  There are 109,383 other characters out there, go and use 'em.

Saturday, 7 June 2014

Keeping your source, safe

Too many times now have I seen a fear of committing code, with many developers waiting until they are absolutely certain their code is damn near perfect before hitting commit.  I blame the terminology: 'commit' sounds so final, as if it carries reputation consequences.  That's why I prefer to call them checkpoints:


A checkpoint is a point in time that you can return to - no matter what happens:

- Your hard drive fails
- You find yourself needing to backtrack
- You take a holiday
- You lose a 'life'

The more checkpoints you have, the more choice you're giving yourself in the future to return to.

That's why I'm an advocate of checking your code in early and often. Don't worry if it's a work in progress, tests are missing, or it's not perfect. Check it in!

Of course I'm not advocating checking in crap, so there have to be rules:

- It should compile
- All tests pass
- You keep it on your own branch
- You include any new code since the last commit
- A commit message is nice (although not mandatory for every commit)

These are just common courtesy to your fellow developers, meaning they'll be able to pick up from where you left off, whatever the reason.

Using source control like this keeps your code safe, provides an audit trail, and allows others to see your work.

Therefore I urge you to commit often; after all, it's your branch.

Thursday, 27 March 2014

Reliance on implementation details

Recently I stumbled across an issue in a legacy VB.NET app which didn't appear to make any sense.  The issue involved determining the precision of a Decimal, which was giving different results for exactly the same value.

First of all I wrote a quick test to attempt to replicate the problem, which appeared to happen for 0.01:
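The original tests were VB.NET; a C# equivalent of the first one looks roughly like this (Precision is the method under investigation):

[TestMethod]
public void Precision_Of_PointZeroOne_Is_Two()
{
    decimal value = 0.01m;                // created directly from a decimal literal
    Assert.AreEqual(2, Precision(value)); // passes
}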


This passed. Then I noticed that in a particular method call the signature was expecting a Decimal but was instead being supplied a Single (yes, Option Strict was off [1]), meaning the Single was being implicitly converted. Quickly writing a test incorporating the conversion:
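Again as a C# sketch; the legacy app performed the conversion implicitly thanks to Option Strict Off, so the explicit cast below stands in for it:

[TestMethod]
public void Precision_Of_PointZeroOne_Converted_From_Single()
{
    float single = 0.01f;
    decimal value = (decimal)single;      // conversion from Single rather than the decimal literal
    Assert.AreEqual(2, Precision(value)); // fails - Precision reports 3
}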


Causes the issue:


It seems to think 0.01 is to 3 decimal places!

So what's going on here? How can a conversion affect the result of Precision()? Looking at the implementation I could see it was relying on the individual bits the Decimal is made up of, using Decimal.GetBits() to access them:
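Shown here as a C# sketch of the sort of logic described: only the exponent byte held in the fourth element of the GetBits result is consulted.

public static int Precision(decimal value)
{
    int[] bits = decimal.GetBits(value);  // [low, mid, high, flags]
    return (bits[3] >> 16) & 0xFF;        // bits 16-23 of the flags element hold the exponent (scale)
}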


The result of Decimal.GetBits() is a four-element array, of which the first three elements represent the bits that make up the value of the Decimal.  However, this method relies only on the fourth element - which contains the exponent. In the first test the decimal value was 1 with exponent 131072; the failed test had 10 and 196608.

When converting to binary we see the difference more clearly; I've named them bitsSingle for the failed test and bitsDecimal for the passing test:
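A sketch of that comparison in C#, with the values as reported above:

int[] bitsDecimal = decimal.GetBits(0.01m);          // passing test: value 1,  flags 131072
int[] bitsSingle  = decimal.GetBits((decimal)0.01f); // failing test: value 10, flags 196608

// Pull out the exponent byte of each and print it in binary
Console.WriteLine(Convert.ToString((bitsDecimal[3] >> 16) & 0xFF, 2).PadLeft(8, '0')); // 00000010 (2)
Console.WriteLine(Convert.ToString((bitsSingle[3] >> 16) & 0xFF, 2).PadLeft(8, '0'));  // 00000011 (3)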


As you can see the exponent for bitsSingle is 3 (00000011) whereas the exponent for bitsDecimal is 2 (00000010), which represent negative powers of 10.

Looking back at the original numbers we can see how these both accurately represent 0.01:

bitsSingle has a value of 10, with an exponent of -3: 10 x 10^-3 = 0.01
bitsDecimal has a value of 1, with an exponent of -2: 1 x 10^-2 = 0.01

As you can see Decimal can represent the same value even though the underlying data differs. Precision() is only relying on the exponent and ignoring the value, meaning it's not taking into account the full picture.

But why is the conversion storing this number differently than when it's instantiated directly?  It just so happens that creating a new Decimal (which uses the Decimal constructor) uses slightly different logic from that of the cast. So even though the number is correct, the underlying data is slightly different.

This brings us to the point of the article.  The big picture here is to remember that you should never rely on implementation details, only on what can be accessed through defined interfaces - whether that be a web service, reflection on a class, or peeking into the individual bits of a datatype.  Implementation details can not only change but, in the world of software, are expected to.

If you want to play around with the examples above I've uploaded them to GitHub.

[1] I know it's not okay and there isn't a single reason for this; however, as usual with a legacy app, we simply don't have the time / money to explicitly convert every single type in a 20,000+ LOC project.