Nowadays, as great tools for automatic testing are emerging, more and more developers will accept the cost of automatic testing in order to harness the benefits of better code design, better code documentation, and fewer regression bugs.
However, a lot of code is still not automatically tested -- and, even worse, a lot of code is not automatically testable. Often, the challenge for developers is adding new features to such untested code without adding new bugs. In this article, we propose some simple, pragmatic and concrete solutions that will reduce the risk of regression bugs.
In the real world, automatic test coverage of 100% of a code base represents an ideal goal that can't be achieved. Some parts of the code are very hard or even impossible to test automatically, especially code dedicated to the UI.
Even if you don't have a single line of business code in your UI code -- which, for your sake, I hope is the case -- you still have UI code. Typically, UI code associates handlers with UI events, enables and disables controls, provides hacks to circumvent tricky UI framework behavior and bugs, does some drawing, and so on.
In the case of an application with a complex UI, such as the NDepend tool for .NET developers, there are more than 12,000 lines of UI code out of a total of about 50,000 lines of C# code. So far, our automatic tests cover around 60% of our code (almost 30,000 lines), and we know that we will face a 'test coverage barrier' when we try to go beyond 70% coverage.
What can we do to try to ensure the correctness of such code? The most important thing is to test it relentlessly before integrating it into any production release. But then, how can we guarantee its correctness in subsequent releases? How can we be sure that, for example, the WinForms code generator didn't decide to get rid of some of our UI event handler bindings? (Any resemblance to a real-world situation is intended.) Here, there are only two options:
- Before each public release, you manually test all the code that has not been tested automatically. Note: This is a sure-fire way to kill a project's agility.
- Carefully review the code that has been modified between the last public release and the upcoming one.
This second option concentrates your focus on a bounded subset of critical code. This approach makes sense, because all experienced developers have learned two things the hard way:
- Most new bugs in a new version come from modified code and brand-new code.
- Unchanged code that has worked well for a long time is unlikely to break in the next release (this is what we call stable code). We don't claim that stable code doesn't contain bugs, but discovering a bug in stable code is rare.
What to do before each public release?
Concretely, here are the four steps the NDepend team undertakes before each public release:
- Make sure that none of the automatic tests are broken. Even better, we run the automatic tests several times a day with continuous integration tools, each time a change is committed to the code base.
- Make sure that our code coverage ratio didn't decrease since the last public release. If it did, we need new automatic tests.
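One could imagine enforcing this step with a CQL constraint along the following lines. This is only a sketch: it combines the CodeWasChanged and WasAdded conditions described later in this article with a coverage condition (here assumed to be named PercentageCoverage, and assuming coverage data has been imported into NDepend):

WARN IF Count > 0 IN SELECT METHODS WHERE (CodeWasChanged OR WasAdded) AND PercentageCoverage < 100

Such a constraint would flag any method touched since the last release that is not fully covered by the automatic tests.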
- Carefully review code changes between the code base of the last public release and the current code base. During this step, we collect the bug fixes, refactorings, and added features that we will enumerate in the release notes. The good news is that, generally, the amount of code that has changed represents a small portion of the entire code base.
- Manually test all changes and new features in release mode. If bugs are discovered, we fix them and go back to Step 1. To make things easier, we then consider reviewing only the code that was changed to correct these last-minute bugs.
This methodology led us to develop the Compare Build feature of the NDepend tool. After all, NDepend is a tool for .NET developers and we are .NET developers. We are our first users; we eat our own dog food! What we really needed was to know precisely the set of methods changed, added or removed since the last public release.
Modern source code repositories (such as CVS, SVN, VSS, TFS, ClearCase and others) can tell you about source file changes, but they are a bit naïve. They can't see the difference between a changed comment and a changed line of code. They can't discern that a class has been moved from one source file to another. They don't know about visibility, method signatures, or field encapsulation. As a result, when looking for code changes, coping with numerous false positives is tedious.
Comparing two code-base versions
We realized we needed smart exploration of the difference between two versions of our code base. So, we added new facilities to the CQL (Code Query Language). CQL is to your code base what SQL is to your relational database: a query language. Here is the set of primary CQL queries that we needed:
SELECT METHODS WHERE CodeWasChanged
SELECT METHODS WHERE WasAdded
SELECT METHODS WHERE WasRemoved
SELECT METHODS WHERE CommentsWereChanged
SELECT METHODS WHERE VisibilityWasChanged
SELECT METHODS WHERE BecameObsolete
It is interesting to use the NDepend metric view to get an idea of where changes occurred and the size of the impact. In the metric view, each rectangle represents a code element (here, rectangles are methods). The size of a rectangle is proportional to the metric value for the corresponding code element (here the size of a rectangle is proportional to the number of lines of code of the corresponding method).
What follows are some screenshots in which we compare the code of NDepend v2.3.0 and NDepend v2.4.0.
In Figure 1, 836 methods and 7,791 lines of C# code were added.
In Figure 2, 300 methods and 2,554 lines of C# code were removed.
Figure 3 shows that 184 methods were changed.
Creating rules to constrain how code is modified
The CQL language natively supports writing rules, or constraints. A CQL constraint automatically warns you when the result of a CQL query differs from an expected result. For example, the following CQL constraint warns you if the code of a method has been changed without its comments being updated:
WARN IF Count > 0 IN SELECT METHODS WHERE CodeWasChanged AND !CommentsWereChanged
A bigger problem that framework developers face is avoiding breaking changes, to ensure compatibility between versions. The two CQL constraints below warn if a method (or a type) that was public is not public anymore (i.e., its visibility has changed or it has been removed):
WARN IF Count > 0 IN SELECT METHODS WHERE IsInOlderBuild AND IsPublic AND (VisibilityWasChanged OR WasRemoved)
WARN IF Count > 0 IN SELECT TYPES WHERE IsInOlderBuild AND IsPublic AND (VisibilityWasChanged OR WasRemoved)
To avoid breaking changes, and to avoid breaking client classes that implement your interfaces, you must also make sure that public interfaces are not modified:
WARN IF Count > 0 IN SELECT TYPES WHERE IsInterface AND IsPublic AND WasChanged
Often, we have to cope with fragile code that has become stable. We then decide to freeze such code until it can be completely refactored, and we want to be warned if a developer accidentally modifies this code in the meantime:
The code of Namespace1 and Namespace2 should not be modified until we refactor it:
WARN IF Count > 0 IN SELECT NAMESPACES WHERE CodeWasChanged AND (NameIs "Namespace1" OR NameIs "Namespace2")
Static analysis tools such as FxCop or NDepend typically emit thousands of suggestions the first time they analyze an existing code base. Developers are then reluctant to apply these suggestions to existing code, but they are willing to ensure the quality of brand-new code. CQL can then be used to restrict suggestions to code that was recently changed or added:
WARN IF Count > 0 IN SELECT METHODS WHERE (CodeWasChanged OR WasAdded) AND (NbLinesOfCode > 20 OR CyclomaticComplexity > 10 OR PercentageComment < 30 …)
Other information obtained when comparing two code versions
We believe that the delta between two versions of a code base contains precious information. However, this information is often ignored for lack of tools.
For example, it can be useful to know how the use of tier code has evolved. Tier code is made of the .NET Framework and any other tier libraries used by an application. The CQL query below lists the tier methods that were not used by the previous version and that are now used…
SELECT METHODS WHERE IsUsedRecently
…and the following query lists the tier methods that were used by the previous version and that are not used anymore:
SELECT METHODS WHERE IsNotUsedAnymore
The CQL queries below list the tier types/namespaces/assemblies that contain some code elements that match IsUsedRecently or IsNotUsedAnymore:
SELECT TYPES WHERE IsUsedDifferently
SELECT NAMESPACES WHERE IsUsedDifferently
SELECT ASSEMBLIES WHERE IsUsedDifferently
Several times, we have anticipated problems by analyzing how tier code use changed. For example, we developed our own path framework to circumvent some System.IO.Path limitations and, thanks to this feature, we realized that System.IO.Path was still in use in some methods. (This framework is named NDepend.Helpers.FileDirectoryPath, and we released it as an open-source project on CodePlex.)
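A query along the following lines could then pinpoint the offending methods. This is a sketch: it assumes a CQL condition for direct usage of a given tier type, here named IsDirectlyUsing:

SELECT METHODS WHERE IsDirectlyUsing "System.IO.Path"

Turning this query into a WARN IF Count > 0 constraint would then prevent System.IO.Path from creeping back in after the migration to our own path framework.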
There is other interesting information that can be inferred by analyzing the delta between two versions of an application. For example, before releasing a new version, we like to know how the coupling between our layers has evolved. The Dependency Structure Matrix (shown in Figure 5) displays a red tick in cells that represent a modified dependency. If the dependency has been added, the red tick contains a plus sign; if the dependency has been removed, the red tick contains a minus sign. Notice also that code elements that have been modified are underlined, and those that have been added are in bold.
In this article, we have introduced tools and recipes to cope with the evolution of code that is not automatically tested. Obviously, the best thing to do is to refactor untested code to make it testable, but this is unfortunately not always possible. By applying the method outlined above, we can avoid many regression bugs while regularly adding new features.
About the author
Patrick Smacchia is a .NET MVP who has been involved in software development for more than 15 years. He is the author of Practical .NET2 and C#2, a .NET book conceived from real-world experience, with 647 compilable code listings. After graduating in mathematics and computer science, he worked on software in a variety of fields, including the stock exchange at Société Générale, an airline ticket reservation system at Amadeus, and a satellite base station at Alcatel. He is currently a software consultant and trainer on .NET technologies, as well as the lead developer of the NDepend tool, which provides numerous metrics and caveats on any compiled .NET application.