How to Write Clean Code

Boy, it’s been a while since my last post. I haven’t been blogging partially because I had nothing to say and partially because I had no time. This post will hopefully break the silence and at the same time be useful to my fellow PHP developers out there.

I’ve been talking about clean code and testability for quite some time now. It is simply impossible to cover all the techniques and explain them to a new audience in 40-some minutes during a meetup or a conference.

In this post I will share some of the techniques I use when designing code-bases of open-source libraries that I’m working on and how I think the design I chose helps others to keep their code clean and testable. This post was prompted as sort of a followup to discussions like the one we had on Symfony2 dev mailing list recently. Here I want to state my opinion and provide reasoning for what its worth.

Start with final classes:

When coding a class, I usually use TDD, meaning I write the test for my class before the actual implementation. At that point I usually have no idea how that class is going to look like, what public API (unless I have already partially discovered it from testing another class) it is going to have, which role in the class hierarchy it will take and whether it will have one at all. So, I start out by declaring the class as final, and use private properties and methods, because at that point, that class is final and not part of any inheritance trees.

This both keeps me from extending the class myself later on by also forcing me to think about how I want the class to be extended.

Mock or Stub by Interface:

During the coding of my class, I start seeing some of its dependencies and what they should be doing. At this point I don’t want to code real classes of my object collaborators yet, but I need those collaborators at the same time. So I create an interface for my future collaborator and mock it in the test.

The reason I advice mocking interfaces is simple - concrete classes can be final or can have some of their methods declared as final, at which point mocking is impossible. As we know, an interface in PHP (and in OOP in general) is a contract for classes that are going to implement it, as well as for classes that are going to collaborate with it’s implementations by type-hinting their methods. I think it makes sense to use such contract in cases where you want to replace actual class instance with a test double (be it a stub or mock), since no matter which one you chose, it is going to be an alternative implementation of the real object that needs to follow the same contract. Also keep in mind, that some language specifics in PHP encourages you to use interfaces.

note A mock in PHP should conform to the type-hint of the class being mocked, to achieve the mimicking of that class. Internally, PHPUnit generates a new class with obfuscated name, that extends or implements the class or interface being mocked accordingly. Hence, if the class-to-be-mocked has final methods, they won’t get overloaded in the mock, which may lead to unexpected behavior in tests. Even if the concrete class changes some methods to final later on, the tests that were once working will start breaking while no real public API changes occurred.

Refactor to inheritance:

Starting with final classes is important, because it forces us to make an extra step on our way to inheritance and there is a reason I want that step. Inheritance is very a useful and powerful feature of OOP (I feel like I’ve heard these exact words hundreds of times already) and I am not trying to de-value it. When it comes to programming, inheritance is a way to extend code by adding custom behavior to the child classes, without re-implementing what is already working in the parent, which is great and helps code-reuse a lot.

However. In languages like PHP, where we, poor developers, don’t have the means of horizontal code-reuse (yet?) like mixins or multiple inheritance, extending one class also means that it will not be possible to extend another. I personally feel that a decision like this is very serious and try to defer the need of making one until I know more about the system I’m building and the problem I’m solving. Programmers might find themselves in the middle of interesting problems if that principle is not followed.

Typically that means, that when I finally do extend some class:

I have an interface that I need to conform to
Classes at the bottom of my hierarchy are typically final
Classes at the top of the hierarchy are usually abstract
Most of the class members are private
Only methods and properties that need to be extended are protected

For every class operating on internal collaborators there has to be an interface:

The statement above might not be clear to everyone, so before justifying it, let me be clear on what I mean.

Assume you have a library that sends emails (SwiftMailer). That library has the Mailer class and Transport classes. The Mailer class can be configured with a Transport of choice (think SMTP, SendMail, etc.). What I mean is that the Mailer class should have a MailerInterface that it implements, because the class itself relies on collaborators to work. At the same time classes that are responsible for only tracking their internal state like value objects or PHP’s DateTime don’t need an interface.

The rationale here is simple - whenever I need to test a class that collaborates with Mailer, I don’t want to spend time on complicated setup of the Mailer object. Instead, I want to mock it and tell the object how it should behave in the test. The presence of the interface makes is that much simpler.

For every class that will be used in user-land code there must be an interface:

The rationale here is also somewhat simple: let’s be kind to other developers, and provide them a shortcut to stub or mock our library classes they will have to interact with directly from the classes they own, without having to worry about complicated setups in tests.

Every part of the library that can be extended must have an Interface:

If a user wants to provide alternative implementation of some class and the library is designed to allow that, there must be an interface that the user class can implement. In case of SwiftMailer, that means a TransportInterface to let us provide alternative email transportation means.

note Even if you designed an abstract class that needs to be extended, there should be an interface that lets users of the library write their own implementation from scratch. While an Interface is a contract, an Abstract Class is a suggestion and should not be considered a contract on its own.

Don’t force users of your library to use static methods:

I feel static methods are probably one of the biggest lies in OOP. They give you a sense of object oriented design, while they really are functions that live in global space, that cannot be encapsulated or replaced with test implementations if used inside objects and lead to all sorts of problems. There, I said it. Now let me try to explain myself.

When one calls a static method, it looks like Class::method(), that means that our code is all of a sudden dependent on the class Class (I know…), which, despite of all of our interfaces and best practices, binds us to a concrete implementation and most importantly, denies us from checking that our code actually calls this method internally while testing (unless we modify state from inside the static method itself, which is asking for even more trouble).

There is a good summary of the reasons static methods kill testability on Miško Hevery’s blog.

Conclusion:

When designing a system, especially the one to be used by others, one should concentrate on extensibility and flexibility. When I say extensibility, I don’t mean “leave all your classes open to inheritance, use protected properties everywhere, so that any class could be extended and changed to the core”. In fact, a technique like that kills flexibility on the side of library developers and maintainers by making refactoring impossible.

Extensibility means “let the system be extended to perform more than what it can initially”, but there are many means of doing it, composition and dependency injection being the most powerful ones. A well-designed system will result in stable API that can be extended over time without worrying about backwards compatibility (BC) breaks, just like the open/closed principle suggests, by making no changes to the core class and extending or decorating it to achieve more.

note A “refactoring” that breaks BC, could not be considered such, as refactoring by definition is … changing source code without modifying external behavior to improve code-reuse and design. Code that is used by end-users can be considered public.

note Dependency Injection means the user injects the dependencies in the setup part of the application, it definitely does not mean the user can pull in or suck in the dependencies by having a service lookup (that pattern would be probably called Dependency Sucking). In dependency injection you only pass around what’s needed and not shove all objects into some kind of service locator class (think Registry) and let other classes extract what they need. One of the advantages of DI is the fact that by re-assembling the system components, we can achieve different behavior for the end system, this advantage is lost with service lookups hard-coded in concrete classes. I feel that the clarification is important as even some of the most well-known PHP-ers can mix the terms sometimes, let alone everyone else.

Happy coding!

Software Alchemist

I am constantly seeking answers and ways of transforming desires into reality by coding