Working Efficiently With Legacy Code

Sunday, September 2, 2012

This is my summary of the book “Working Effectively with Legacy Code” by Michael C. Feathers.

Gist of the book

Legacy code is untested code.
Bad code is untested code.
Even good, untested code is bad code.
Write tests. Write code. Write more tests. Write more code. Write more tests.

Did you say TDD?

Yes, this book is about tests. It introduces some vocabulary to talk about tests, then actually talk about tests by looking at specific use cases. It specifically looks at legacy use cases, where putting a piece of code under test can lead to a lot of stress.

Introduction to new concepts

If you’re new to TDD, and if you’re never written tests before you probably have some catchup to do on vocabulary. This book introduces those potentially new concept pretty well.

Test harness

A test harness is a piece of code written to exercice some piece of software.

Sensing

Getting access to internals of a piece of software. Think about loop indexes fot instance. Or internal state of an object. Things that you usually have to print out in order to debug your program.

Fake and mock objects

A fake object is an object to enable test harness. It basically exists in place of a real object, for instance a DB connection.
A mock object is a fake object containing assertions.

Seams

A place where a program’s behavior can be changed without editing it.
For instance:

preprocessing seam
link seam
object seam Every seam has an enabling point (classpath, call to a method, etc).

Pinch points

A pinch point is a point where tests can detect change in meny methods at the same time.

Characterization tests

Tests that describe the behavior of the piece of software in its harness.

Sprouts

Sprouting refers to a technique aiming at extracting new behavior to a new method (“sprout method”) or class (“sprout class”).

Wraps

Wrapping a method (“Wrap method”) or a class (“Wrap class”) refers to the process of encapsulation or decoration to add new behavior to legacy code.

In the case of the Wrap Method, the new behavior is implemented in a new method, and the old function calls the new one, so that new behavior will be called by old(new()).
The Wrap Class is really similar: the new class inherits from the old one and adds the new behavior required.

Dependency inversion principle

Code should depend on interface, not implementation.

Open/Closed principle

A piece of code should be open for extension but closed for modification, which basically means a good design doesn’t require to change much code: you just have to write new code for a new feature.

How should you develop a new feature?

TDD + programming by difference (subclass and override behavior).
Gotcha: “Liskov Substitution Principle” — avoid overiding concrete methods. The ideal hierarchy is a hierarchy where no class overides one of its parents' methods. You can achieve that via abstract classes and inheritance from interfaces instead.

I can’t get this class into a test harness

Extract interface and subclass into a test. That’s how you can get around annoying params.

I can’t run this method into a test harness

First, ask yourself if you should run this method to begin with. You should avoid testing private methods. Test public interfaces or make a private method public if you really need to test it.
Cheap solution to test a private method: make it protected and subclass.

Command/Query separation principle

A “query” method is a method that has no side effect and returns a value.
A “command” method is a method that has side effect and returns no value.

The “Command/Query separation principle” simply dictates that a method shouldn’t be a query and a command at the same time.

Finding test points

Learning to reason about effects, draw effect stucture and find pinch points. Make sure to include subclasses and other callers in the system to have a full picture.

I don’t know what tests to write

If you’re working with legacy code, write characterization tests:

aim for piece of logic, don’t hesitate to make use of sensing variables
think about the responsabilities of your code and cber what can go wrong
test extreme values of inputs
try to look for and expose invariant

I don’t understand the code well enough to change it

If you don’t understand the code well enough, here are a couple of things you can try:

print the code and try to make sense of it
scratch refactoring: just experiment with refactoring the code, make it run…but don’t commit!
look for, and delete unused code

Dealing with big classes

The problem with big classes is that they tend to be confusing, not well-tested (because they ARE a pain to test), and extremely hard to change since intance variable can make any refactor affect the whole class.

Single Responsability Principle

Every class should have a single responsibility: it should have a single purpose in the system, and there should be only one reason to change it.

Heuristics to break a monster class

Several heuristics are listed in this chapter:

Group methods together. Look at names to give you hints about what methods should go together. Try to look for instance var reuse
Look at hidden methods (protected or private). They often indicate that another class should emerge.
Try to think of decisions that can change. This is a hard one, but try to see hard-coded decisions in the class. Now make them configurable.
Point out internal relations or isolations. Very often, instance variable are used by parts of a class, but not by another.
Describe the responsability of that monster class as a single sentence. Write it as a comment at the top of the file, and cut what’s not relevant in another class. Quite drastic.
Scratch refactoring: let your imagination flow and edit some code. Don’t commit.
Focus on your work (new feature) and evaluate how this impacts this class' responsability. Hopefully you’ll see things through a new lens.

Interface Segregation Principle

When a class is really large, the other parts of the system interacting with it rarely need to use all its methods at once. So we can use different groupings of methods to “segregate” the clients. The way to do it is to define those groupings as interface and make the monster class implement them all. That will bring isolation between the large class and the rest of the system.

Dealing with big methods

Big methods are kind of the same as big classes. They should be avoided because monster methods are complex, confusing, hard to test and really difficult to change.

Two kinds of big methods out here: bulleted methods and snarled methos. A bulleted method is a method having a flat structure. It’s basically saying: do A, then do B, then do C, etc. Snarled methods are worst because they have conditionals, and loops mixed in. Their structure is dominated by indented blocks, which makes them far less readable.

Misc

A “slow” test is a test that takes 0.1s to run or more.
Abbreviations are okay when they’re used consistently, but they are generally a bad idea. Try not to abbreviate when you can avoid it.