
Retrofitting test cases

The main project I work on has an automated unit test application which is built and run as part of the build process.  For our office, the team was an early adopter of the concept, but unfortunately this has not translated into an extensive and well-maintained set of unit tests.  Put politely: it would be good to improve this coverage.  This raises a fairly obvious question: where do you begin?
First of all, it is worth analysing the statement: “It would be good to improve this coverage”.  There are a couple of benefits to having unit tests, and these benefits help explain the statement.
Avoidance of regression bugs.  This is reasonably self-evident.  If you have sufficient coverage of your classes, you cannot introduce faulty logic without a unit test failing.  Over the years, many quality assurance staff have told me “the earlier you catch a bug, the less expensive it is to fix”.  So many, in fact, that I now believe them.  (In truth, I don’t think I ever doubted the “earlier = cheaper” argument.)  Anyway, if you are running unit tests as part of the build process and building regularly, it stands to reason that any bugs introduced and caught will be fixed cheaply.
Code re-use.  A less obvious benefit is that the code you are testing now has two uses: one in the application and one in the test case.  While this may seem a contrived second use, it should help with code abstraction.  The theory goes that the more usages a class has, the less chance it has of being tightly coupled with other classes.  Tightly coupled classes increase code complexity, and the more tightly coupled they are, the more likely a change in one class will introduce a regression bug in another.
Now that we have defined the benefits we hope to achieve through the use of unit tests, it becomes easier to decide where to begin.  We want the unit tests to reduce instances of regression bugs and improve code abstraction – which is enforced by re-use.  The history logs of the revision control system can be studied to show the frequency of changes in a unit.  If a mature project has a hot-spot of continual changes to a given class, then that may well be an indicator of frequent regressions and “hopeful” changes rather than “thoughtful” changes.

There is a very good chance that such a class violates the single responsibility principle, which in turn makes writing unit tests for it an unviable proposition.  Now that we have identified what is wrong, we have a good place to start:

  • Look through the class and identify the different responsibilities the class has.
  • Extract the responsibility that is least entangled with the rest of the class.
  • Write unit tests for this new class.
  • Rinse, repeat.

Once you have extracted a new class and written the unit tests, your changes are not necessarily complete.  As you extract more classes, there is a chance that your new class can be further decoupled from the original.  In my experience, the important thing to remember is that removing a responsibility from a class does not immediately decouple the two classes.  Proper decoupling can take several iterations.  As I stated, start with the easy abstractions first.  As classes become less entangled, the harder areas will become easier to deal with.  At least, that’s the theory!
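To make the extraction process concrete, here is a minimal sketch in C#.  The scenario, the class names and the NUnit-style test are illustrative inventions, not code from the project discussed above.  Suppose CSV escaping was the least entangled responsibility buried inside a larger report-exporting class; once extracted into its own class, it can be unit tested in isolation:

    using NUnit.Framework;

    // The extracted responsibility: escaping a single CSV field.
    public class CsvEscaper
    {
        // Wraps a field in quotes (doubling any embedded quotes)
        // when it contains a comma or a quote character.
        public string Escape(string field)
        {
            if (field.Contains(",") || field.Contains("\""))
                return "\"" + field.Replace("\"", "\"\"") + "\"";
            return field;
        }
    }

    // A unit test for the newly extracted class.
    [TestFixture]
    public class CsvEscaperTests
    {
        [Test]
        public void Escape_QuotesFieldsContainingCommas()
        {
            var escaper = new CsvEscaper();
            Assert.AreEqual("\"a,b\"", escaper.Escape("a,b"));
        }
    }

Note that the test exercises the new class directly, with no reference to the class it was extracted from; that independence is the second “use” described earlier.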

Commercial reality

One of my favourite blog writers is Eric Sink.  I find his writing style entertaining and informative.  I do not find Revision Control Systems the most interesting subject matter: I use one, it works, I am happy.  But for Eric, they are his speciality.  After all, he owns a company that writes them.  In a recent article, he discussed the speed vs. storage-space trade-off.  As his “guinea-pig” he used a source code file containing a single class, which was 400KB in size at the latest revision, and braced himself for a barrage of comments regarding whether such a file constitutes an example of poor coding.

Certainly, when you take into account things such as the single responsibility principle, it seems unlikely that a single class should grow to such a size.  Such a class would seem to be a target for a future refactoring exercise.  Refactoring is a worthy cause, and there is no shortage of reading material that carefully constructs solid arguments for why it should be done.  But a more worthy cause is making software sell.  Pet projects and home hobbies aside, software is of no benefit if no one is using it.

Commercial reality dictates that if a new feature will help sell more copies of the software, then adding that feature is what matters.  It is true that the more obfuscated code becomes, the harder it is to expand it to incorporate new features.  I have heard of software projects grinding to a halt because adding new features simply became too difficult.

Using commercial pressures as an excuse to write sloppy code is not acceptable.  I have seen examples where code that looks like “the first thing that popped into the developer’s head” has been committed to the Revision Control System.  Often, with very little extra thought (read “time”), a neater, better solution could have been found.  This is where task estimation is important.  In my experience, programmers will use the full amount of time allocated to perform any task.  In all likelihood, they will get to the end of that time-frame, realise they overlooked at least one aspect, and then take longer, but that is not the point I am trying to make here!

If you have allocated “a day” to add a feature, then most often that is how long it will take to add.  If you had allocated “half a day” for the same feature, then I would wager that the feature would have been added in about that half-day time-frame.  Granted, this is not always the case, but experience has shown me that this is surprisingly often accurate.

This stems from the fact that if a programmer knows how long they are expected to take, they will get it working first, then tinker with the code until the time has elapsed.  Some “tinker time” assists in overall code readability.  If you are not prepared to add “code refactoring tasks” to your project plan (regardless of the project methodology you use), then a certain “slackness” in task estimation gives your code a fighting chance of staying relatively neat.

When time pressures arise, neatness and accuracy of code are amongst the early casualties.  Unfortunately, this seems unavoidable and is just the price that is paid to remain profitable.  Whilst I strive to write and maintain neat, manageable, accurate code, I live in the real world and know that regularly revised source code over two years old (Eric’s example was seven years old) will likely be of the “too long / overly complex” variety.  I will not be one to criticise him for that.

Identifying what is wrong with code

There are a number of head-nodding moments in Clean Code.  The book mentions many coding guidelines and good-code principles that aim to simplify the code we write.  Software development is definitely a career in which you must continuously learn.  Even so, I’m somewhat disappointed to be this far along in my career before learning the names of certain concepts that until now I have only known as “common sense”.

There have been times when I have identified “bad code”, but being pressed to state what was so bad about it left me speechless…  This does not make the task of improving the code impossible, but it can leave you unable to justify and/or quantify the changes you make.

Here are some of the more common rules:

One level of abstraction per function

This rule is not given a “trendy term”.  Unlike the other rules I’ll introduce, it deals with functions rather than classes, but it is still important.  The rule possibly came from the “days of old” when top-down procedural programming was all the rage.  Sans lengthy example, it is best summarised as: do not mix low-level details with abstract concepts in the one function.  If you stick to the idea that each function should only do one thing, then you will be going a long way towards avoiding that mix.
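As a brief sketch (the class and method names here are hypothetical, not from the book): the first function below mixes the abstract step of building a header with low-level string manipulation, while the second version keeps each function at a single level of abstraction.

    public class ReportHeader
    {
        // Mixed levels: the high-level flow and the low-level
        // string details live in the same function.
        public string BuildMixed(string title)
        {
            string rule = new string('-', 40);
            string centred = title.PadLeft((40 + title.Length) / 2).PadRight(40);
            return rule + "\n" + centred + "\n" + rule;
        }

        // One level: the function reads as a sequence of abstract steps...
        public string Build(string title)
        {
            return RuleLine() + "\n" + CentredTitle(title) + "\n" + RuleLine();
        }

        // ...and the string-level details live one level down.
        private string RuleLine()
        {
            return new string('-', 40);
        }

        private string CentredTitle(string title)
        {
            return title.PadLeft((40 + title.Length) / 2).PadRight(40);
        }
    }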

Single Responsibility Principle

The Single Responsibility Principle (SRP) states: there should never be more than one reason for a class to change.  The Object Mentor web-site has a series of PDFs discussing various aspects of clean code, and has this (in part) to say of SRP:

“If a class has more than one responsibility, then the responsibilities become coupled.  Changes to one responsibility may impair or inhibit the class’ ability to meet the others.  This kind of coupling leads to fragile designs that break in unexpected ways when changed.”

In a rather circular fashion, the definition of a responsibility in SRP is “a reason for change”.  In other words, a class written to obey the SRP has only one responsibility.  If we were to develop an object-oriented “Space Invaders” game, we may end up with a class for the aliens that looks like:

    public class Alien
    {
        public void Move()   { /* AI-driven movement logic */ }
        public void Render() { /* graphics calls to draw the alien */ }
    }

However, this is a violation of SRP: we now have two reasons for changing this class.  If the AI model changed, the Move method may require changes; likewise, if there were changes to the graphics subsystem, the Render method would alter.
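One common way to resolve the violation (a sketch of the usual fix, not an example taken from the book) is to move each reason for change into its own class and have Alien delegate to them:

    // Each collaborator now has a single reason to change.
    public class AlienMovement
    {
        public void Move() { /* changes only when the AI model changes */ }
    }

    public class AlienRenderer
    {
        public void Render() { /* changes only when the graphics subsystem changes */ }
    }

    public class Alien
    {
        private readonly AlienMovement movement = new AlienMovement();
        private readonly AlienRenderer renderer = new AlienRenderer();

        public void Move()   { movement.Move(); }
        public void Render() { renderer.Render(); }
    }

Alien keeps its public surface, but an AI change and a graphics change now land in different classes.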


Law of Demeter

This is also known as the Principle of Least Knowledge.

“The fundamental notion is that a given object should assume as little as possible about the structure or properties of anything else (including its subcomponents).”

Object-oriented languages assist here by providing different scopes for the methods and member variables defined in a class.  Clean Code points out that often you will see private member variables (good so far) exposed through public getter and setter methods (not so good).  This exposes the implementation details of a class, allowing other classes to violate the Law of Demeter.  Better (so the book argues) to expose the data through abstract concepts:

“A class does not simply push its variables out through getters and setters.  Rather it exposes abstract interfaces that allow its users to manipulate the essence of the data, without having to know its implementation.”

The example given talks about a vehicle class that exposes the “percentage of fuel remaining” as opposed to “fuel tank capacity” and “gallons remaining”. 
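A short sketch of that contrast (the interface and property names below are my own, not the book’s):

    // Concrete: exposes implementation details, so callers must know
    // about tanks and gallons and do the arithmetic themselves.
    public interface IVehicleGauges
    {
        double FuelTankCapacityInGallons { get; }
        double GallonsOfGasolineRemaining { get; }
    }

    // Abstract: exposes only the concept callers actually need,
    // leaving the implementation free to change.
    public interface IVehicle
    {
        double PercentFuelRemaining { get; }
    }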


This is by no means a definitive list of the ideas presented in the book, but in my experience, violations of these principles account for the vast majority of the “ugly code” I have seen.  Once these techniques have been highlighted, it is possible to see where improvements can be made in existing code, and by refactoring code to follow these principles, you can demonstrate that tangible improvements have been made.