Thursday, 27 March 2014

Reliance on implementation details

Recently I stumbled across an issue in a legacy VB.NET app which didn't appear to make any sense. The issue involved determining the precision of a Decimal, which was giving different results for exactly the same value.

First of all I wrote a quick test to attempt to replicate the problem, which appeared to happen for 0.01:

    private int expectedDecimalPlaces = 2;

    [TestMethod]
    public void Test2DecimalPoint_WithDecimal_ExpectSuccess()
    {
        decimal i = 0.01m;
        int actual = Program.Precision(i);
        Assert.AreEqual(expectedDecimalPlaces, actual);
    }

This passed. Then I noticed that a particular method signature expected a Decimal but was being supplied a Float (yes, Option Strict was off [1]), meaning the Float was being implicitly converted. I quickly wrote a test incorporating the conversion:

    private int expectedDecimalPlaces = 2;

    [TestMethod]
    public void Test2DecimalPoint_CastFromFloat_ExpectSuccess()
    {
        float i = 0.01f;
        int actual = Program.Precision((decimal)i);
        Assert.AreEqual(expectedDecimalPlaces, actual);
    }

This reproduces the issue:


It seems to think 0.01 is to 3 decimal places!

So what's going on here? How can a conversion affect the result of Precision()? Looking at the implementation I could see it was relying on the individual bits the Decimal is made up of, using Decimal.GetBits() to access them:

    public static int Precision(Decimal number)
    {
        // The fourth element returned by GetBits() holds the flags; bits 16-23
        // of it contain the scale - the negative power of 10 applied to the value
        int flags = Decimal.GetBits(number)[3];

        // Pull out the scale byte (index 2 assumes a little-endian platform)
        byte scale = BitConverter.GetBytes(flags)[2];

        return scale;
    }

Decimal.GetBits() returns a 4-element array, of which the first 3 elements hold the bits that make up the coefficient of the Decimal. However this method relies only on the fourth element - the flags, which contain the exponent. In the passing test the decimal had a coefficient of 1 and a flags value of 131072; the failing test had a coefficient of 10 and flags of 196608.

Converting those flags to binary shows the difference more clearly. I've named them bitsSingle for the failing test and bitsDecimal for the passing test:

bitsSingle     00000000 00000011 00000000 00000000
               |\-----/ \------/ \---------------/
               |   |       |             |
        sign <-+ unused exponent       unused
               |   |       |             |
               |/-----\ /------\ /---------------\
bitsDecimal    00000000 00000010 00000000 00000000

NOTE: exponent represents multiplication by negative power of 10

As you can see the exponent for bitsSingle is 3 (00000011) whereas the exponent for bitsDecimal is 2 (00000010).

Looking back at the original numbers we can see how both accurately represent 0.01:

bitsSingle has a coefficient of 10 with an exponent of -3: 10 × 10^-3 = 0.01
bitsDecimal has a coefficient of 1 with an exponent of -2: 1 × 10^-2 = 0.01

As you can see, Decimal can represent the same value even though the underlying data differs. Precision() relies only on the exponent and ignores the coefficient, so it's not taking the full picture into account.
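
One way around this is to derive the precision from the value itself rather than from its representation. Here's a minimal sketch of that idea (ValuePrecision is a name I've made up, and it ignores edge cases such as values so large that the repeated multiplication would overflow):

    public static int ValuePrecision(decimal number)
    {
        // Count how many times the value must be scaled by 10 before the
        // fractional part disappears - this depends only on the value,
        // not on the underlying bits
        int precision = 0;
        while (Decimal.Truncate(number) != number)
        {
            number *= 10;
            precision++;
        }
        return precision;
    }

Because decimal comparisons work on values rather than representations (0.010m == 0.01m is true), this returns 2 for both versions of 0.01 above.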

But why does the conversion store the number differently than when it's instantiated directly? It just so happens that creating a new Decimal from a literal (which uses the Decimal constructor) uses slightly different logic from the cast. So even though the number is correct, the underlying data is slightly different.
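
You can see the two construction paths side by side with a snippet along these lines (the flag values in the comments are the ones observed in the tests above):

    decimal direct = 0.01m;             // created from the decimal literal
    decimal converted = (decimal)0.01f; // created by casting from a float

    Console.WriteLine(direct == converted);           // True - same value
    Console.WriteLine(Decimal.GetBits(direct)[3]);    // 131072 (scale 2)
    Console.WriteLine(Decimal.GetBits(converted)[3]); // 196608 (scale 3)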

This brings us to the point of the article. The big picture here is that you should never rely on implementation details, only on what is exposed through defined interfaces - whether that be a web service, reflection on a class, or peeking into the individual bits of a datatype. Implementation details can not only change, but in the world of software are expected to.

If you want to play around with the examples above I've uploaded them to GitHub.

[1] I know it's not OK and there's no single reason for this; however, as usual with a legacy app, we simply don't have the time or money to explicitly convert every single type in a 20,000+ LOC project.

Wednesday, 18 December 2013

Highlights of the year (literally)

As the end of the year approaches, I thought it'd be prudent to make a list of all the nuggets of advice and insight I've read this year:

Effective Programming: More Than Writing Code (Jeff Atwood)

It’s amazing how much you find you don’t know when you try to explain something in detail to someone else. It can start a whole new process of discovery.
There's no question that, for whatever time budget you have, you will end up with better software by releasing as early as practically possible, and then spending the rest of your time iterating rapidly based on real-world feedback. So trust me on this one: even if version 1 sucks, ship it anyway. 

Lehman's laws of software evolution
As an evolving program is continually changed, its complexity, reflecting deteriorating structure, increases unless work is done to maintain or reduce it.

Scrum: A breathtakingly Brief and Agile Introduction (Chris Sims, Hillary Louise Johnson)
The daily scrum should always be held to no more than 15 minutes.

ReadWrite.com (Matt Asay)
Oracle has never been particularly community-friendly. Even the users that feed it billions in sales every quarter don't particularly love it.

The Art of Unit Testing: with Examples in .NET (Roy Osherove)
Finally, as a friend once said, a good bottle of vodka never hurts when dealing with legacy code.

Thursday, 17 October 2013

Recently I had the need to decode a Base64 string and make a PDF of it. Usually I would've written a small utility app, but this time I rolled with PowerShell:

function decodeBase64IntoPdf([string]$base64EncodedString)
{
    $bytes = [System.Convert]::FromBase64String($base64EncodedString)
    [IO.File]::WriteAllBytes("C:\Users\medmondson\Desktop\file.pdf", $bytes)
}

I'm impressed with how quickly I can knock out a script like this (yes, those are .NET assemblies being called) without having to load a new VS solution. Of course a lot more could be done to it (the file format via an argument, for example) but I thought I'd share it raw, as I know I'll need to use it again one day.

Monday, 26 August 2013

The myth of software development

When you're developing software, have you ever thought "once this feature is complete I'll be done"? I'm the first to admit that there is always an end point in sight, believing once I've reached it I'll be able to say I'm finished.

Well, guess what: software can never be considered finished. Don't believe me? Then why is Windows XP still being updated almost 12 years after its initial release?

Psychologically a lot of people compare a software project with more traditional types of projects, such as construction; however, the two are not comparable:

  • Software only ever reaches a state of acceptable functionality
  • Software is infinitely malleable meaning it can never reach a state of 'done'

Both of these reasons, as well as showing why software isn't comparable to construction, show that starting a software project again is very rarely the right choice - instead, adapt the software into the new state of acceptable functionality.

This is because software is the cumulative sum of all previous work; even reasonably small products are the culmination of many man-years. In addition, users understand how it works and all the quirks of its features, including how to use them to the organisation's advantage.

Therefore no matter how spaghetti-ridden, ill-named and awkward that legacy project is, it is almost never the right decision to start again from scratch.

Which is exactly why code needs to be maintainable: you almost certainly won't be the only person who has to look after it. Tools such as ReSharper can help with this, and are great for transforming a spaghetti-ridden legacy project into something you can work with (you may even manage to get some unit test coverage!).

So next time you want to start again from scratch, think very carefully, as it's almost never the right choice.

Tuesday, 26 February 2013

Overview of type suffixes


I'd like to bring your attention to an area of the C# specification which is misunderstood by many:

Type Suffixes


Type suffixes are individual characters that you can append to 'any code representation of a value' (called a literal in .NET), allowing you to specify its exact type. They only relate to numbers, which come in one of two forms:

- Integer literals (whole numbers)
- Real literals (numbers with a fractional part)

If you type 10 into your source code, the compiler will automatically interpret that as an integer type; however if you were to type 10.1 this would be automatically interpreted as a real type (because of the decimal point - the full rules are in the C# specification). To demonstrate this I'll use the var keyword, which assumes a type based on its initial assignment:
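
    var i = 10;   // i is inferred as int - the default for an integer literal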


However if I type 10.1 I get a double (the default for a real literal):
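
    var d = 10.1; // d is inferred as double - the default for a real literal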



Type suffixes allow you to override these defaults.

For example, what if I wanted to specify a float? Well it turns out there's a type suffix for this: f (Single is the .NET name for the float type):
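
    var f = 10.1f; // the f suffix makes the literal a float (System.Single)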

and similarly decimal has m:
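
    var m = 10.1m; // the m suffix makes the literal a decimal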


The point here is that the literal's type is defined the moment you type it into the source code, not by the variable you are assigning it to. This becomes important when you want to assign a literal to a type where there isn't an implicit conversion from the literal's default type:
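
    decimal d = 10.1; // error CS0664: literal of type double cannot be
                      // implicitly converted to type 'decimal'; use an 'M'
                      // suffix to create a literal of this type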


Here you're essentially attempting to store a number inside a box that's too small (usually referred to as a narrowing conversion).  To get around this you need to tell the compiler you actually wanted a decimal:
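
    decimal d = 10.1m; // compiles: the m suffix makes the literal a decimal from the start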


I understand this topic is somewhat basic, but I believe it deserved an overview nonetheless.

Sunday, 14 October 2012

Using tnsnames.ora with your .NET application


Anyone attempting to get their application communicating with an Oracle database will have had to deal with the crucial tnsnames.ora file and the information it contains.  And I'll also bet anyone doing this will have seen this beautifully helpful message:

ORA-12154: TNS:could not resolve the connect identifier specified


The reason for this message is that your application cannot find the information it needs to connect to the database, which is usually contained within the tnsnames.ora file (itself potentially in a number of locations).  The process is analogous to a DNS lookup, which resolves a domain name into an IP address; here you are resolving an alias into a database address.
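
To make the analogy concrete, here's a minimal sketch of how the alias gets used - assuming the ODP.NET provider (Oracle.DataAccess.Client) and a hypothetical alias MYDB defined in tnsnames.ora:

    // Hypothetical tnsnames.ora entry the provider needs to resolve:
    //
    //   MYDB = (DESCRIPTION =
    //            (ADDRESS = (PROTOCOL = TCP)(HOST = dbhost.example.com)(PORT = 1521))
    //            (CONNECT_DATA = (SERVICE_NAME = MYDB)))

    using Oracle.DataAccess.Client;

    class Demo
    {
        static void Main()
        {
            // "MYDB" is the alias, not a hostname - ORA-12154 is what you get
            // when it can't be resolved to an entry like the one above
            using (var connection = new OracleConnection("Data Source=MYDB;User Id=scott;Password=tiger"))
            {
                connection.Open();
            }
        }
    }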

However even if your tnsnames.ora contains the correct information, you may still get this error if the application cannot locate the file containing the appropriate identifier.  Of course there are tools such as tnsping, but I've found its help to be rather limited.

What isn't very well known, though, is that...

Your .NET application will use the tnsnames.ora file located in the same folder as the executable


As far as I know this isn't documented anywhere and I only discovered it by reading this answer.

The upshot is that you can ship the file right alongside your application, and actively modify it for granular control of your application's database connections.

Hope this helps you out too...

Sunday, 30 September 2012

What I've learnt about professional software development


I've been professionally developing software for four years now and thought I'd reflect on what I've learnt over that time. I've gone for the traditional bullet-point style (hope you don't mind), and they're in no particular order.

Loyalty to your employer is good as long as it isn't blind


The software industry moves fast, but the systems at your current place of employment probably don't. As soon as you feel like you've stopped learning in your current job, it would be wise to start looking for a new one. Too often people stick with a job for too long, only to realise their skill set has become irrelevant (VB6 anyone?). Therefore it's always wise to keep your skills in line with what's currently in demand.

Know it's impossible to know everything, so concentrate on learning what's important (whilst knowing what exists)


Because the software industry is so large and constantly changing, it's simply impossible to know everything. However it's equally important not to reinvent the wheel. So keep reading blogs, contribute to Stack Overflow and go to as many user groups as you can. It's totally OK not to know everything about new technologies; just concentrate on what they can bring to your projects, because good developers can learn on the job.

Get a fresh pair of eyes on your problem as soon as you've tried everything


As soon as you find yourself stuck on a particular problem, ask one of your peers to take a look.  Don't worry about appearing stupid; your employer would much rather you be productive as soon as possible.  Too many times I've wasted time debugging a problem, only for a colleague to instantly know the solution. Others have experienced what you haven't, so make use of that. When offered advice, take it and ask questions.

Teach and be taught


Sit with your colleagues, write a blog, build up a Stack Overflow reputation - but most importantly, do anything which gets your thoughts peer reviewed. The only way you learn is by accepting you don't already know everything, or by having a long-held belief proved wrong.

Absolutely everything should be as simple as possible


Your brain can only hold a handful of concepts at once - around seven, in fact. It therefore makes sense that absolutely everything you do should be broken down into the simplest form possible. Read Code Complete, constantly refactor your code, and use source control to store the reasons for your changes.

Users don't always know what they want


This is a big one. Users cannot always visualise what they want, and only narrow down their requirements when they see something tangible. This is where methodologies such as Agile come into play: their iterative processes promote keeping users in the loop, who can then crystallise their requirements before you've done too much work.

Ask prospective employers lots of questions, and understand you can't trust recruiters to do this for you


When looking for new employment you should be interviewing the company as much as they're interviewing you. This means lots of questions. Use your experience to avoid previous employment pitfalls, ask them about the state of their source code and what they expect of you.  In fact you should have lots of questions that you've been collecting over your career.

As for recruiters, my experience is that many are focussed only on their commission and will do whatever they can in order to achieve it.  However there are also really good recruiters out there who genuinely have your interests at heart.  When you cross paths with such a rare breed you should keep in contact with them even when you're not looking for employment, because you never quite know when you might need them.  I personally use LinkedIn to keep in contact, and have recommended the best.

Your experience


That's the end of my list, and hopefully I'll have a handful more points over the next four years.  Perhaps you think I've missed some big ones? Use the comments to share your experience, and perhaps I'll learn from you too.