How to work with Legacy Code

When you’re learning to be a developer, the examples you learn from and the projects you work on are usually built from scratch (known as greenfield projects), and you’re most likely the only person working on the code.

It can be a harsh reality check when you enter the IT industry and find that most of the work involves existing codebases and working as part of a team.

The codebase and the team will each have a set of conventions you’ll need to learn in order to understand why the code was written the way it was and what the team is looking to achieve.

This archaeological dig for context is as much a developer’s job as writing code and fortunately it’s something that can be made easier.

What is Legacy Code?

The most common scenario for Legacy Code is when a team takes on another team’s codebase with minimal time for handover. The high-level functionality has been covered, but the underlying code is still a mystery.

Other scenarios include:

  • Returning to projects after the team has context switched
  • Working in an area of the codebase that you didn’t write or review and the original team members are unavailable
  • Working with code that has out-of-date documentation, so the behaviour of the code doesn’t match the documented intent

Understanding the original intent

Definition of intended behaviour

This behaviour can be found in:

  • User stories or other work items
  • Acceptance criteria or test scenarios
  • Automated tests such as unit, integration or system tests
  • Pages found in the company’s knowledge base, covering the functionality
  • Operational and Support documentation
  • Version control entries such as Git commits

Access to someone from the original team

I’ve seen real benefit from talking to someone from the original team who worked directly with the functionality and was able to recall the intent behind the design decisions made.

However, I’ve also seen a person from the original team cause more chaos by retroactively adding behaviour that wasn’t part of the original intent, which left the developer they were working with more confused. (I used to work on that original team too, which is how I knew they were adding things that weren’t in the original specification.)

Access to a running version of the application or users

If you’ve got access to a user of the application, you might be able to understand how the behaviour has evolved over time, which can add a little more context. However, this can also give you a subjective picture of how that user uses the system rather than what the team intended to happen.

Adding context back into Legacy Code

Documentation that adds this context back can be static, such as knowledge base pages that list the intended behaviour of the code, or dynamic and executable, such as automated tests.

To get the most benefit, keep this documentation as close to the code as possible. Unit tests are a particularly good way of doing this: you can write a set covering the behaviour you know about and, as edge cases pop up, add more tests for that behaviour too.
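As a minimal sketch of tests acting as documentation, the Python example below pins down the known behaviour of a legacy function. The function `calculate_shipping` and its rules are hypothetical stand-ins for real legacy code; the point is that each test name records one piece of confirmed intent.

```python
import unittest


def calculate_shipping(order_total: float, country: str) -> float:
    """Hypothetical legacy function whose rules were never written down."""
    if country != "GB":
        return 15.0
    if order_total >= 50:
        return 0.0
    return 4.99


class CalculateShippingBehaviour(unittest.TestCase):
    """Each test name documents one behaviour the team has confirmed."""

    def test_domestic_orders_of_50_or_more_ship_free(self):
        self.assertEqual(calculate_shipping(50, "GB"), 0.0)

    def test_domestic_orders_under_50_pay_flat_rate(self):
        self.assertEqual(calculate_shipping(49.99, "GB"), 4.99)

    # An edge case discovered later and added to the existing set.
    def test_international_orders_never_ship_free(self):
        self.assertEqual(calculate_shipping(500, "US"), 15.0)


if __name__ == "__main__":
    unittest.main()
```

Reading the test names alone should tell the next developer what the function is expected to do, without them having to trace through the implementation.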

Sometimes, especially with a large or fragmented codebase, it’s not easy to add unit or integration tests. In that case I would suggest a set of end-to-end automated tests (again, try to keep these as close to the code as possible).

Once the behaviour of the functionality is documented, you have a safety net that catches any regressions while you work on the code and make it easier to work with.

Refactoring Legacy Code

Separate concerns

By separating concerns, each new function has a clearer scope, and if the team spots similar functionality elsewhere in the codebase they can replace it with a call to the new function.

As the code gets separated, be sure to add unit tests to these smaller functions, even if the functionality is already covered by the tests against the original function, so you’re not creating even more legacy code for future developers.
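To illustrate, here is a rough Python sketch of one function that used to parse, discount and format in a single body, split into three smaller functions. All of the names and the 10% discount rate are made up for the example; the original entry point is kept as a thin composition so existing callers and tests keep working.

```python
from typing import Tuple


def parse_line(line: str) -> Tuple[str, float]:
    """Split a 'name,amount' line into its parts."""
    name, amount = line.split(",")
    return name.strip(), float(amount)


def apply_discount(amount: float, rate: float = 0.1) -> float:
    """Apply a percentage discount (hypothetical 10% default), rounded to 2 places."""
    return round(amount * (1 - rate), 2)


def format_receipt(name: str, amount: float) -> str:
    """Render one receipt line."""
    return f"{name}: {amount:.2f}"


def process_line(line: str) -> str:
    """The original entry point, now a thin composition of the smaller functions."""
    name, amount = parse_line(line)
    return format_receipt(name, apply_discount(amount))
```

Each small function can now get its own unit tests, and `apply_discount` can replace any similar discount logic found elsewhere in the codebase.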

Modularise code

By creating versioned modules of the code, you can control which increments make it into production: individual modules can be released separately while you also verify that a collection of modules works together.

As with long-lived feature branches, it’s important to run integration tests often and, if possible, release small increments to production, using techniques such as feature toggles to ‘switch off’ functionality until it’s complete and ready for launch.
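A feature toggle can be as simple as the Python sketch below. It assumes toggles are read from environment variables named `FEATURE_<NAME>`, which is an invented convention for this example; real projects often use a configuration service or a dedicated library instead.

```python
import os


def is_enabled(flag: str) -> bool:
    """Read a toggle from an environment variable (hypothetical naming scheme)."""
    return os.environ.get(f"FEATURE_{flag.upper()}", "off").lower() == "on"


def legacy_checkout(total: float) -> str:
    return f"Charged {total:.2f} via legacy flow"


def new_checkout(total: float) -> str:
    return f"Charged {total:.2f} via new flow"


def checkout(total: float) -> str:
    """Route to the new module only when its toggle is switched on."""
    if is_enabled("new_checkout"):
        return new_checkout(total)
    return legacy_checkout(total)
```

The new code path ships to production with every release but stays dormant until the toggle is set to `on`, so integration happens continuously without exposing unfinished work.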

Create change requests

While you may feel that the behaviour you’re seeing is incorrect, there may be contextual information you’ve not yet discovered that means the existing behaviour is actually correct.

By raising change requests instead of raising bugs or rewriting the ticket, you can track the number of changes needed to bring the system behaviour in line with the team’s understanding while tracking the number of actual defects separately.

This also gives the Product Owner a clearer backlog to work with as they can approach the change requests in a different context to the bugs. Similarly, testers on the team will be able to better track these changes instead of treating the existing functionality as a ‘known issue’.

How to prevent your code from becoming Legacy Code

When working with any code, you should make it as easy as possible for a future developer to understand the context of the functionality and the design decisions you’re making.

With a bit of team discipline it’s relatively easy to embed this context and make it easier to find using the following techniques.

Make your version control messages meaningful

If you’re using a system that supports Pull Requests, you can use the description box to explain the context around the changes being made. The comments left by reviewers also help to provide contextual information.

Write automated tests

By writing tests, even high level system or integration tests, you start to bring the contextual information back into the code and make it easier to refactor it at a later date, even if you don’t think you’ll have time to do so.

As the tests are automated, they can be run quickly and provide a lot of feedback on what the code does, so any developer picking up the code can learn in a few minutes what might otherwise take days to unearth through other means.

Use consistent design patterns

By using design patterns consistently, you give future developers a frame for the decisions being made, and even if they don’t agree with those decisions they’ll be able to understand them better.

Keep your documentation up-to-date

To make documentation easier to keep updated, have a single source of truth and build traceability into the codebase and other artefacts so future developers know where to go to find more information.

A better approach would be to implement bi-directional traceability, so that the source of truth also holds contextual information on the code and test cases for that functionality. This makes it easy for future developers to know where to look in the code to make changes and which tests might be impacted.
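One lightweight way to approximate this in code is sketched below in Python. It assumes work items have IDs such as `STORY-101` (invented here), and uses a hypothetical `traces` decorator to link functions and tests to those IDs in both directions. In practice the index would probably be exported to the knowledge base rather than kept in memory.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Maps a work item ID to the names of the functions that implement or test it.
TRACE_INDEX: Dict[str, List[str]] = defaultdict(list)


def traces(work_item: str) -> Callable:
    """Tag a function with the work item it relates to (IDs here are made up)."""
    def decorator(func: Callable) -> Callable:
        func.work_item = work_item                    # code -> source of truth
        TRACE_INDEX[work_item].append(func.__name__)  # source of truth -> code
        return func
    return decorator


@traces("STORY-101")
def apply_loyalty_discount(total: float) -> float:
    """Hypothetical behaviour described by STORY-101."""
    return round(total * 0.95, 2)


@traces("STORY-101")
def test_loyalty_discount_is_five_percent():
    assert apply_loyalty_discount(100.0) == 95.0
```

From the code, `apply_loyalty_discount.work_item` tells a developer which work item to read; from the work item, `TRACE_INDEX["STORY-101"]` lists the code and tests that a change to that story might touch.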

Summary

By understanding that Legacy Code is a problem of missing context, a development team can work with the Product Owner to ensure time is set aside to build up that contextual understanding and to better estimate the work being asked of them.

Additionally, by refactoring code and putting measures in place to prevent their own code from becoming Legacy Code, the development team can reduce the time that future development in that area will take.

Originally published at https://averment.digital on June 23, 2020.
