Small!
- The first rule of functions is that they should be small.
- The second rule of functions is that they should be smaller than that.
No research is cited to justify these two assertions; they come from the author’s experience.
The blocks within if statements, else statements, while statements and so on should be one line long. That line should probably be a function call, which gives you two advantages:
- it keeps the enclosing function small;
- it adds documentary value because the function called within the block can have a nicely descriptive name.
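As a rough illustration of one-line blocks, here is a minimal sketch. The OrderProcessor class and all of its method names are hypothetical and do not come from the book; they only show how each block can collapse into a single, descriptively named call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical order-handling example; every name here is illustrative.
class OrderProcessor {
    final List<String> shipped = new ArrayList<>();
    final List<String> rejected = new ArrayList<>();

    // The loop body is a single call, so the function stays tiny.
    void processAll(List<String> orders) {
        for (String order : orders)
            process(order);
    }

    // Each branch is one line whose name documents what happens.
    private void process(String order) {
        if (isValid(order))
            ship(order);
        else
            reject(order);
    }

    // Assumption for this sketch: an order is valid when it is not blank.
    private boolean isValid(String order) {
        return !order.trim().isEmpty();
    }

    private void ship(String order) {
        shipped.add(order);
    }

    private void reject(String order) {
        rejected.add(order);
    }
}
```

Reading processAll and process top to bottom tells the whole story without looking at any of the helper bodies.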
Do one thing
One of the most important rules about functions:
Do One Thing
Functions should do one thing. They should do it well. They should do it only.
Sometimes it is not immediately obvious what “one thing” is. Let’s consider this function:
public static String renderPageWithSetupsAndTeardowns(
    PageData pageData, boolean isSuite) throws Exception {
  if (isTestPage(pageData))
    includeSetupAndTeardownPages(pageData, isSuite);
  return pageData.getHtml();
}
This function includes some setup and teardown pages in a test page and then renders that page into HTML. It performs three operations:
- determining whether the page is a test page;
- if so, including setups and teardowns;
- rendering the page in HTML.
Is this function doing one thing or three things? Notice that the three steps of the function are one level of abstraction below the stated name of the function. We can describe the function as a “TO paragraph” where each sentence corresponds to a step in the function at the same level of abstraction: “To RenderPageWithSetupsAndTeardowns, we check to see whether the page is a test page, and if so, we include the setups and teardowns. In either case, we render the page in HTML.” If a function does only those steps that are one level below the stated name of the function, then the function is doing one thing.
We could extract the if statement into a function named includeSetupAndTeardownsIfTestPage, but that simply restates the code without changing the level of abstraction.
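To see why that extraction adds nothing, here is a minimal sketch. PageData is a hypothetical stub (its constructor, isTestPage and wrap methods are assumptions made only so the example compiles), and the setup and teardown markers are placeholders.

```java
// Hypothetical stub standing in for the real PageData class.
class PageData {
    private final boolean testPage;
    private String content;

    PageData(String content, boolean testPage) {
        this.content = content;
        this.testPage = testPage;
    }

    boolean isTestPage() {
        return testPage;
    }

    void wrap(String before, String after) {
        content = before + content + after;
    }

    String getHtml() {
        return "<div>" + content + "</div>";
    }
}

class PageRenderer {
    static String renderPageWithSetupsAndTeardowns(PageData pageData, boolean isSuite) {
        includeSetupAndTeardownsIfTestPage(pageData, isSuite);
        return pageData.getHtml();
    }

    // The extracted function: its name just reads the if statement aloud,
    // so the level of abstraction has not changed.
    static void includeSetupAndTeardownsIfTestPage(PageData pageData, boolean isSuite) {
        if (pageData.isTestPage())
            includeSetupAndTeardownPages(pageData, isSuite);
    }

    // Placeholder behaviour: wraps the page in setup and teardown markers.
    static void includeSetupAndTeardownPages(PageData pageData, boolean isSuite) {
        String scope = isSuite ? "suite" : "test";
        pageData.wrap("[" + scope + " setup]", "[" + scope + " teardown]");
    }
}
```

The outer function is one line shorter, but a reader learns nothing new from the extra name.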
Use Descriptive Names
The name of a function describes what the function itself does.
The smaller and more focused a function is, the easier it is to choose a descriptive name. Furthermore, don’t be afraid to make a name long because a long descriptive name is better than a short enigmatic name.
Function Arguments
The ideal number of arguments for a function is zero (niladic). Next comes one (monadic), followed closely by two (dyadic). Three arguments (triadic) should be avoided where possible. More than three (polyadic) requires very special justification—and then shouldn’t be used anyway.
Arguments are hard:
- readers have to interpret them each time they see them, because an argument is at a different level of abstraction than the function name and forces the reader to know the details;
- from a testing point of view, it would be difficult to write all the test cases to ensure that all the various combinations of arguments work properly. With no arguments, this would be trivial.
Output arguments are harder to understand than input arguments because we are used to the idea of information going into the function through arguments and out through the return value, and we don’t usually expect information to come out through the arguments.
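The difference can be sketched as follows. The Report class and the footer text are hypothetical; the point is only to contrast a call that mutates its argument with a call where the object changes its own state.

```java
// Hypothetical report type used to contrast the two styles.
class Report {
    private final StringBuilder text = new StringBuilder();

    Report(String body) {
        text.append(body);
    }

    // Output-argument style: a reader of appendFooter(s) must check the
    // signature to learn that 'target' is modified rather than read.
    static void appendFooter(StringBuilder target) {
        target.append("\n-- end --");
    }

    // Alternative: the object changes its own state, so the call
    // report.appendFooter() leaves no doubt about what is modified.
    void appendFooter() {
        text.append("\n-- end --");
    }

    String text() {
        return text.toString();
    }
}
```

Both versions produce the same text; only the second makes the direction of the data flow obvious at the call site.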
Common Monadic Forms
There are two very common reasons to pass a single argument into a function:
- ask a question about that argument, as in boolean fileExists("MyFile");
- operate on that argument, transforming it into something else and returning it.
A less common use for monadic functions is to describe an event. In this form there is an input argument but no output argument. Usually that argument serves to alter the state of the system. An example is void passwordAttemptFailedNtimes(int attempts).
Try to avoid monadic functions that don’t follow these forms. For example, using an output argument instead of a return value for transformation is confusing. If a function is going to transform its input argument, the transformation should appear as the return value.
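The three forms might look like this in practice. This is only a sketch: the in-memory set of file names, the toUpperSnakeCase transformation and the failedAttempts counter are assumptions introduced for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical holder for one example of each monadic form.
class MonadicForms {
    // Assumption: a fixed in-memory set stands in for a real file system.
    private static final Set<String> knownFiles = new HashSet<>(Arrays.asList("MyFile"));
    private static int failedAttempts = 0;

    // Form 1: ask a question about the argument.
    static boolean fileExists(String name) {
        return knownFiles.contains(name);
    }

    // Form 2: transform the argument and return the result,
    // instead of writing the result into an output argument.
    static String toUpperSnakeCase(String name) {
        return name.trim().replace(' ', '_').toUpperCase(Locale.ROOT);
    }

    // Form 3: an event. There is no return value; the argument
    // is used to alter the state of the system.
    static void passwordAttemptFailedNtimes(int attempts) {
        failedAttempts = attempts;
    }

    static int failedAttempts() {
        return failedAttempts;
    }
}
```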
Flag Arguments
Flag arguments are ugly. Passing a boolean into a function is a terrible practice: it complicates the method’s signature and loudly proclaims that the function does more than one thing, one thing when the flag is true and another when it is false. Such a function should be split into two separate functions.
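A minimal sketch of such a split is shown below. The ReportRenderer class, its method names and the returned strings are hypothetical; they only illustrate how a flag can be replaced by two intention-revealing functions.

```java
// Hypothetical renderer used to illustrate splitting a flagged function.
class ReportRenderer {
    // Flagged version: a call such as render(true) says nothing
    // about what 'true' means without a look at the signature.
    static String render(boolean isSuite) {
        return isSuite ? renderForSuite() : renderForSingleTest();
    }

    // Split version: each function does one thing and says so.
    static String renderForSuite() {
        return "suite report";
    }

    static String renderForSingleTest() {
        return "single test report";
    }
}
```

Callers that previously wrote render(true) now write renderForSuite(), which reads correctly without any extra context.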
Dyadic Functions
There are cases where two arguments are appropriate. For example, Cartesian points naturally take two arguments, such as: Point p = new Point(0, 0);. On the other hand, even obvious dyadic functions like assertEquals(expected, actual) are problematic because it’s not easy to remember that expected is the first argument and actual is the second.
In some cases you can convert a dyadic function into a monadic one, for example by making one of the arguments a field of the class that owns the method.
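One way to do that conversion is to move one argument into a field, sketched below. The FieldWriter class is hypothetical, and a StringBuilder is assumed as the output sink purely to keep the example free of I/O handling.

```java
// Hypothetical writer that holds the output sink as a field.
class FieldWriter {
    private final StringBuilder out;

    FieldWriter(StringBuilder out) {
        this.out = out;
    }

    // Dyadic version: every call has to pass the sink along.
    static void writeField(StringBuilder out, String name) {
        out.append(name).append('\n');
    }

    // Monadic version: the sink is part of the object,
    // so the call site only mentions the field being written.
    void writeField(String name) {
        out.append(name).append('\n');
    }
}
```

With the monadic form, writer.writeField(name) leaves no argument order to remember.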
Argument Objects
It’s possible to wrap some arguments into a class of their own. For example, the function Circle makeCircle(double x, double y, double radius); could be rewritten as Circle makeCircle(Point center, double radius);. It’s not cheating, because Point is a concept that deserves a name.
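The rewritten signature could be backed by classes like these. Only the makeCircle signature comes from the text; the field layout of Point and Circle and the Shapes holder class are assumptions for this sketch.

```java
// Assumed minimal value types; only makeCircle's signature is from the text.
final class Point {
    final double x;
    final double y;

    Point(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

final class Circle {
    final Point center;
    final double radius;

    Circle(Point center, double radius) {
        this.center = center;
        this.radius = radius;
    }
}

class Shapes {
    // Two arguments instead of three: x and y travel together as a Point.
    static Circle makeCircle(Point center, double radius) {
        return new Circle(center, radius);
    }
}
```

A call such as Shapes.makeCircle(new Point(1, 2), 3) also makes it harder to confuse a coordinate with the radius.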
Verbs and Keywords
Choosing good names is important to explain the function’s intent and the arguments’ order and intent.
In the case of a monadic function, the function and argument should form a verb-noun pair. For example, write(name) evokes the precise idea of “whatever this name thing is, it is being written”. You can be more specific with writeField(name), which tells us that the “name” is a “field”. This is an example of the keyword form of a function name, in which the names of the arguments are encoded into the function name. For example, with assertExpectedEqualsActual(expected, actual) you don’t need to remember the ordering of the arguments.
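A keyword-form assertion could be sketched like this. The KeywordAssert class and the failure message format are assumptions; this is not the API of any real testing library.

```java
import java.util.Objects;

// Hypothetical helper; not part of JUnit or any other library.
class KeywordAssert {
    // The function name spells out the argument order: expected, then actual.
    static void assertExpectedEqualsActual(Object expected, Object actual) {
        if (!Objects.equals(expected, actual))
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
    }
}
```

A call such as KeywordAssert.assertExpectedEqualsActual(4, 2 + 2) reads in the same order as the arguments are passed.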
Have no Side Effects
Side effects happen when your function promises to do one thing, but it also does other hidden things. Some examples:
- unexpected changes to the variables of its own class;
- unexpected changes to the parameters passed into the function.
These scenarios could lead to temporal coupling and order dependencies.
The following code uses a standard algorithm to match a userName to a password, but it contains a side effect: the Session.initialize() call.
public class UserValidator {
  private Cryptographer cryptographer;
  public boolean checkPassword(String userName, String password) {
    User user = UserGateway.findByName(userName);
    if (user != User.NULL) {
      String codedPhrase = user.getPhraseEncodedByPassword();
      String phrase = cryptographer.decrypt(codedPhrase, password);
      if ("Valid Password".equals(phrase)) {
        Session.initialize();
        return true;
      }
    }
    return false;
  }
}
Note that the checkPassword function, by its name, says that it checks the password. The name does not imply that it initializes the session. So, a caller who believes what the name of the function says runs the risk of erasing the existing session data. This side effect creates a temporal coupling because checkPassword can only be called at certain times, that is, when it’s safe to initialize the session. If it is called out of order, session data may be inadvertently lost. If you have a temporal coupling, it should be clear in the name of the function, such as checkPasswordAndInitializeSession, though that violates the “Do one thing” principle.
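One possible way to remove the hidden side effect is to keep checkPassword a pure check and let the caller initialize the session explicitly. The sketch below is not the book’s solution: PasswordChecker, UserSession and LoginService are hypothetical, and a plain in-memory map replaces the UserGateway and Cryptographer machinery (real code would never store passwords in clear text).

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the user lookup and decryption steps; hypothetical.
class PasswordChecker {
    // Assumption: clear-text passwords in a map, for illustration only.
    private final Map<String, String> passwordsByUser = new HashMap<>();

    void register(String userName, String password) {
        passwordsByUser.put(userName, password);
    }

    // Pure check: answers the question and changes nothing.
    boolean checkPassword(String userName, String password) {
        return password != null && password.equals(passwordsByUser.get(userName));
    }
}

// Hypothetical session object that records whether it was initialized.
class UserSession {
    private boolean initialized = false;

    void initialize() {
        initialized = true;
    }

    boolean isInitialized() {
        return initialized;
    }
}

class LoginService {
    private final PasswordChecker checker;
    private final UserSession session;

    LoginService(PasswordChecker checker, UserSession session) {
        this.checker = checker;
        this.session = session;
    }

    // The session initialization is now visible at this level,
    // one step above the password check itself.
    boolean login(String userName, String password) {
        if (!checker.checkPassword(userName, password))
            return false;
        session.initialize();
        return true;
    }
}
```

With this split, calling checkPassword at any time is safe, and the only function that touches the session is the one whose name suggests it.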
Command Query Separation
Functions should either do something or answer something, but not both. Either your function should change the state of an object, or it should return some information about that object. For example, consider the function:
public boolean set(String attribute, String value);
This function sets the value of a named attribute and returns true if it is successful and false if no such attribute exists. This leads to an odd statement like this:
if (set("username", "unclebob")) ...
From the reader’s point of view, the meaning of this statement is hard to infer because it is not clear whether the word set is a verb or an adjective. The statement could be asking whether the username was previously set to unclebob, or whether the username was successfully set to unclebob. The solution is to separate the command from the query so that the ambiguity cannot occur:
if (attributeExists("username")) {
  setAttribute("username", "unclebob");
}
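A class offering that separated pair might be sketched as follows. The Attributes class, its constructor and the getAttribute accessor are assumptions; only the attributeExists and setAttribute names come from the snippet above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical attribute store; only two method names come from the text.
class Attributes {
    private final Map<String, String> values = new HashMap<>();

    // Assumption: the set of valid attribute names is fixed at construction.
    Attributes(String... names) {
        for (String name : names)
            values.put(name, "");
    }

    // Query: returns information and changes nothing.
    boolean attributeExists(String attribute) {
        return values.containsKey(attribute);
    }

    // Command: changes state and returns nothing.
    void setAttribute(String attribute, String value) {
        if (!attributeExists(attribute))
            throw new IllegalArgumentException("No such attribute: " + attribute);
        values.put(attribute, value);
    }

    // Query used to observe the result of the command.
    String getAttribute(String attribute) {
        return values.get(attribute);
    }
}
```

Throwing on an unknown attribute is a design choice of this sketch; the caller is expected to ask attributeExists first, as in the snippet above.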