Repository: glen9527/Clean-Code-zh
Branch: master
Commit: 28a2ca4069d9
Files: 25
Total size: 704.8 KB
Directory structure:
gitextract_0vuhkduc/
├── .gitignore
├── LICENSE
├── README.md
├── docs/
│ ├── .vuepress/
│ │ └── config.js
│ ├── README.md
│ ├── apA.md
│ ├── ch1.md
│ ├── ch10.md
│ ├── ch11.md
│ ├── ch12.md
│ ├── ch13.md
│ ├── ch14.md
│ ├── ch15.md
│ ├── ch16.md
│ ├── ch17.md
│ ├── ch2.md
│ ├── ch3.md
│ ├── ch4.md
│ ├── ch5.md
│ ├── ch6.md
│ ├── ch7.md
│ ├── ch8.md
│ └── ch9.md
├── gitee-deploy.sh
└── package.json
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
node_modules/
docs/.vuepress/dist/
================================================
FILE: LICENSE
================================================
The MIT License (MIT)
Copyright (c) 2018-present, Yuxi (Evan) You
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
================================================
FILE: README.md
================================================
# Clean-Code-zh
《代码整洁之道》中文翻译
在线阅读:[http://gdut_yy.gitee.io/doc-cleancode/](http://gdut_yy.gitee.io/doc-cleancode/)
## 前言
## Index
- [第 1 章 整洁代码](docs/ch1.md)
- [第 2 章 有意义的命名](docs/ch2.md)
- [第 3 章 函数](docs/ch3.md)
- [第 4 章 注释](docs/ch4.md)
- [第 5 章 格式](docs/ch5.md)
- [第 6 章 对象和数据结构](docs/ch6.md)
- [第 7 章 错误处理](docs/ch7.md)
- [第 8 章 边界](docs/ch8.md)
- [第 9 章 单元测试](docs/ch9.md)
- [第 10 章 类](docs/ch10.md)
- [第 11 章 系统](docs/ch11.md)
- [第 12 章 迭进](docs/ch12.md)
- [第 13 章 并发编程](docs/ch13.md)
- [第 14 章 逐步改进](docs/ch14.md)
- [第 15 章 JUnit 内幕](docs/ch15.md)
- [第 16 章 重构 SerialDate](docs/ch16.md)
- [第 17 章 味道与启发](docs/ch17.md)
- [附录 A 并发编程 II](docs/apA.md)
## 本地开发 & 阅读
本项目基于 vuepress 进行开发,以提供比 github mardown 更佳的阅读体验
依赖于 `node.js`、`yarn`、`vuepress` 等环境
```sh
# vuepress
yarn global add vuepress
# 本地开发
git clone https://github.com/gdut-yy/Clean-Code-zh.git
cd Clean-Code-zh/
yarn docs:dev
# 本地阅读
http://localhost:8080/doc-cleancode/
```
## License
[MIT](https://github.com/gdut-yy/Clean-Code-zh/blob/master/LICENSE)
================================================
FILE: docs/.vuepress/config.js
================================================
// .vuepress/config.js
module.exports = {
// 网站的标题
title: "Clean Code 中文",
// 上下文根
base: "/doc-cleancode/",
themeConfig: {
// 假定是 GitHub. 同时也可以是一个完整的 GitLab URL
repo: "gdut-yy/Clean-Code-zh",
// 自定义仓库链接文字。默认从 `themeConfig.repo` 中自动推断为
// "GitHub"/"GitLab"/"Bitbucket" 其中之一,或是 "Source"。
repoLabel: "Github",
// 以下为可选的编辑链接选项
// 假如你的文档仓库和项目本身不在一个仓库:
docsRepo: "gdut-yy/Clean-Code-zh",
// 假如文档放在一个特定的分支下:
docsBranch: "master/docs",
// 默认是 false, 设置为 true 来启用
editLinks: true,
// 默认为 "Edit this page"
editLinkText: "帮助我们改善此页面!",
// 最后更新时间
lastUpdated: "Last Updated",
// 最大深度
sidebarDepth: 2,
// 导航栏
nav: [],
// 侧边栏
sidebar: {
"/": [
"",
"ch1.md",
"ch2.md",
"ch3.md",
"ch4.md",
"ch5.md",
"ch6.md",
"ch7.md",
"ch8.md",
"ch9.md",
"ch10.md",
"ch11.md",
"ch12.md",
"ch13.md",
"ch14.md",
"ch15.md",
"ch16.md",
"ch17.md",
"apA.md"
]
}
}
};
================================================
FILE: docs/README.md
================================================
# Clean Code 中文
` so that the formatting that is apparent in the source code is preserved within the Javadoc.1
1. An even better solution would have been for Javadoc to present all comments as preformatted, so that comments appear the same in both code and document.
Line 86 is the class declaration. Why is this class named SerialDate? What is the significance of the word “serial”? Is it because the class is derived from Serializable? That doesn’t seem likely.
I won’t keep you guessing. I know why (or at least I think I know why) the word “serial” was used. The clue is in the constants SERIAL_LOWER_BOUND and SERIAL_UPPER_BOUND on line 98 and line 101. An even better clue is in the comment that begins on line 830. This class is named SerialDate because it is implemented using a “serial number,” which happens to be the number of days since December 30th, 1899.
I have two problems with this. First, the term “serial number” is not really correct. This may be a quibble, but the representation is more of a relative offset than a serial number. The term “serial number” has more to do with product identification markers than dates. So I don’t find this name particularly descriptive [N1]. A more descriptive term might be “ordinal.”
The second problem is more significant. The name SerialDate implies an implementation. This class is an abstract class. There is no need to imply anything at all about the implementation. Indeed, there is good reason to hide the implementation! So I find this name to be at the wrong level of abstraction [N2]. In my opinion, the name of this class should simply be Date.
Unfortunately, there are already too many classes in the Java library named Date, so this is probably not the best name to choose. Because this class is all about days, instead of time, I considered naming it Day, but this name is also heavily used in other places. In the end, I chose DayDate as the best compromise.
From now on in this discussion I will use the term DayDate. I leave it to you to remember that the listings you are looking at still use SerialDate.
I understand why DayDate inherits from Comparable and Serializable. But why does it inherit from MonthConstants? The class MonthConstants (Listing B-3, page 372) is just a bunch of static final constants that define the months. Inheriting from classes with constants is an old trick that Java programmers used so that they could avoid using expressions like MonthConstants.January, but it’s a bad idea [J2]. MonthConstants should really be an enum.
```java
public abstract class DayDate implements Comparable,
Serializable {
public static enum Month {
JANUARY(1),
FEBRUARY(2),
MARCH(3),
APRIL(4),
MAY(5),
JUNE(6),
JULY(7),
AUGUST(8),
SEPTEMBER(9),
OCTOBER(10),
NOVEMBER(11),
DECEMBER(12);
Month(int index) {
this.index = index;
}
public static Month make(int monthIndex) {
for (Month m : Month.values()) {
if (m.index == monthIndex)
return m;
}
throw new IllegalArgumentException(“Invalid month index ” + monthIndex);
}
public final int index;
}
```
Changing MonthConstants to this enum forces quite a few changes to the DayDate class and all it’s users. It took me an hour to make all the changes. However, any function that used to take an int for a month, now takes a Month enumerator. This means we can get rid of the isValidMonthCode method (line 326), and all the month code error checking such as that in monthCodeToQuarter (line 356) [G5].
Next, we have line 91, serialVersionUID. This variable is used to control the serializer. If we change it, then any DayDate written with an older version of the software won’t be readable anymore and will result in an InvalidClassException. If you don’t declare the serialVersionUID variable, then the compiler automatically generates one for you, and it will be different every time you make a change to the module. I know that all the documents recommend manual control of this variable, but it seems to me that automatic control of serialization is a lot safer [G4]. After all, I’d much rather debug an InvalidClassException than the odd behavior that would ensue if I forgot to change the serialVersionUID. So I’m going to delete the variable—at least for the time being.2
2. Several of the reviewers of this text have taken exception to this decision. They contend that in an open source framework it is better to assert manual control over the serial ID so that minor changes to the software don’t cause old serialized dates to be invalid. This is a fair point. However, at least the failure, inconvenient though it might be, has a clear-cut cause. On the other hand, if the author of the class forgets to update the ID, then the failure mode is undefined and might very well be silent. I think the real moral of this story is that you should not expect to deserialize across versions.
I find the comment on line 93 redundant. Redundant comments are just places to collect lies and misinformation [C2]. So I’m going to get rid of it and its ilk.
The comments at line 97 and line 100 talk about serial numbers, which I discussed earlier [C1]. The variables they describe are the earliest and latest possible dates that DayDate can describe. This can be made a bit clearer [N1].
```java
public static final int EARLIEST_DATE_ORDINAL = 2; // 1/1/1900
public static final int LATEST_DATE_ORDINAL = 2958465; // 12/31/9999
```
It’s not clear to me why EARLIEST_DATE_ORDINAL is 2 instead of 0. There is a hint in the comment on line 829 that suggests that this has something to do with the way dates are represented in Microsoft Excel. There is a much deeper insight provided in a derivative of DayDate called SpreadsheetDate (Listing B-5, page 382). The comment on line 71 describes the issue nicely.
The problem I have with this is that the issue seems to be related to the implementation of SpreadsheetDate and has nothing to do with DayDate. I conclude from this that EARLIEST_DATE_ORDINAL and LATEST_DATE_ORDINAL do not really belong in DayDate and should be moved to SpreadsheetDate [G6].
Indeed, a search of the code shows that these variables are used only within SpreadsheetDate. Nothing in DayDate, nor in any other class in the JCommon framework, uses them. Therefore, I’ll move them down into SpreadsheetDate.
The next variables, MINIMUM_YEAR_SUPPORTED, and MAXIMUM_YEAR_SUPPORTED (line 104 and line 107), provide something of a dilemma. It seems clear that if DayDate is an abstract class that provides no foreshadowing of implementation, then it should not inform us about a minimum or maximum year. Again, I am tempted to move these variables down into SpreadsheetDate [G6]. However, a quick search of the users of these variables shows that one other class uses them: RelativeDayOfWeekRule (Listing B-6, page 390). We see that usage at line 177 and line 178 in the getDate function, where they are used to check that the argument to getDate is a valid year. The dilemma is that a user of an abstract class needs information about its implementation.
What we need to do is provide this information without polluting DayDate itself. Usually, we would get implementation information from an instance of a derivative. However, the getDate function is not passed an instance of a DayDate. It does, however, return such an instance, which means that somewhere it must be creating it. Line 187 through line 205 provide the hint. The DayDate instance is being created by one of the three functions, getPreviousDayOfWeek, getNearestDayOfWeek, or getFollowingDayOfWeek. Looking back at the DayDate listing, we see that these functions (lines 638–724) all return a date created by addDays (line 571), which calls createInstance (line 808), which creates a SpreadsheetDate! [G7].
It’s generally a bad idea for base classes to know about their derivatives. To fix this, we should use the ABSTRACT FACTORY3 pattern and create a DayDateFactory. This factory will create the instances of DayDate that we need and can also answer questions about the implementation, such as the maximum and minimum dates.
3. [GOF].
```java
public abstract class DayDateFactory {
private static DayDateFactory factory = new SpreadsheetDateFactory();
public static void setInstance(DayDateFactory factory) {
DayDateFactory.factory = factory;
}
protected abstract DayDate _makeDate(int ordinal);
protected abstract DayDate _makeDate(int day, DayDate.Month month, int year);
protected abstract DayDate _makeDate(int day, int month, int year);
protected abstract DayDate _makeDate(java.util.Date date);
protected abstract int _getMinimumYear();
protected abstract int _getMaximumYear();
public static DayDate makeDate(int ordinal) {
return factory._makeDate(ordinal);
}
public static DayDate makeDate(int day, DayDate.Month month, int year) {
return factory._makeDate(day, month, year);
}
public static DayDate makeDate(int day, int month, int year) {
return factory._makeDate(day, month, year);
}
public static DayDate makeDate(java.util.Date date) {
return factory._makeDate(date);
}
public static int getMinimumYear() {
return factory._getMinimumYear();
}
public static int getMaximumYear() {
return factory._getMaximumYear();
}
}
```
This factory class replaces the createInstance methods with makeDate methods, which improves the names quite a bit [N1]. It defaults to a SpreadsheetDateFactory but can be changed at any time to use a different factory. The static methods that delegate to abstract methods use a combination of the SINGLETON,4 DECORATOR,5 and ABSTRACT FACTORY patterns that I have found to be useful.
4. Ibid.
5. Ibid.
The SpreadsheetDateFactory looks like this.
```java
public class SpreadsheetDateFactory extends DayDateFactory {
public DayDate _makeDate(int ordinal) {
return new SpreadsheetDate(ordinal);
}
public DayDate _makeDate(int day, DayDate.Month month, int year) {
return new SpreadsheetDate(day, month, year);
}
public DayDate _makeDate(int day, int month, int year) {
return new SpreadsheetDate(day, month, year);
}
public DayDate _makeDate(Date date) {
final GregorianCalendar calendar = new GregorianCalendar();
calendar.setTime(date);
return new SpreadsheetDate(
calendar.get(Calendar.DATE),
DayDate.Month.make(calendar.get(Calendar.MONTH) + 1),
calendar.get(Calendar.YEAR));
}
protected int _getMinimumYear() {
return SpreadsheetDate.MINIMUM_YEAR_SUPPORTED;
}
protected int _getMaximumYear() {
return SpreadsheetDate.MAXIMUM_YEAR_SUPPORTED;
}
}
```
As you can see, I have already moved the MINIMUM_YEAR_SUPPORTED and MAXIMUM_YEAR_SUPPORTED variables into SpreadsheetDate, where they belong [G6].
The next issue in DayDate are the day constants beginning at line 109. These should really be another enum [J3]. We’ve seen this pattern before, so I won’t repeat it here. You’ll see it in the final listings.
Next, we see a series of tables starting with LAST_DAY_OF_MONTH at line 140. My first issue with these tables is that the comments that describe them are redundant [C3]. Their names are sufficient. So I’m going to delete the comments.
There seems to be no good reason that this table isn’t private [G8], because there is a static function lastDayOfMonth that provides the same data.
The next table, AGGREGATE_DAYS_TO_END_OF_MONTH, is a bit more mysterious because it is not used anywhere in the JCommon framework [G9]. So I deleted it.
The same goes for LEAP_YEAR_AGGREGATE_DAYS_TO_END_OF_MONTH.
The next table, AGGREGATE_DAYS_TO_END_OF_PRECEDING_MONTH, is used only in Spread-sheetDate (line 434 and line 473). This begs the question of whether it should be moved to SpreadsheetDate. The argument for not moving it is that the table is not specific to any particular implementation [G6]. On the other hand, no implementation other than SpreadsheetDate actually exists, and so the table should be moved close to where it is used [G10].
What settles the argument for me is that to be consistent [G11], we should make the table private and expose it through a function like julianDateOfLastDayOfMonth. Nobody seems to need a function like that. Moreover, the table can be moved back to DayDate easily if any new implementation of DayDate needs it. So I moved it.
The same goes for the table, LEAP_YEAR_AGGREGATE_DAYS_TO_END_OF_MONTH.
Next, we see three sets of constants that can be turned into enums (lines 162–205). The first of the three selects a week within a month. I changed it into an enum named WeekInMonth.
```java
public enum WeekInMonth {
FIRST(1), SECOND(2), THIRD(3), FOURTH(4), LAST(0);
public final int index;
WeekInMonth(int index) {
this.index = index;
}
}
```
The second set of constants (lines 177–187) is a bit more obscure. The INCLUDE_NONE, INCLUDE_FIRST, INCLUDE_SECOND, and INCLUDE_BOTH constants are used to describe whether the defining end-point dates of a range should be included in that range. Mathematically, this is described using the terms “open interval,” “half-open interval,” and “closed interval.” I think it is clearer using the mathematical nomenclature [N3], so I changed it to an enum named DateInterval with CLOSED, CLOSED_LEFT, CLOSED_RIGHT, and OPEN enumerators.
The third set of constants (lines 189–205) describe whether a search for a particular day of the week should result in the last, next, or nearest instance. Deciding what to call this is difficult at best. In the end, I settled for WeekdayRange with LAST, NEXT, and NEAREST enumerators.
You might not agree with the names I’ve chosen. They make sense to me, but they may not make sense to you. The point is that they are now in a form that makes them easy to change [J3]. They aren’t passed as integers anymore; they are passed as symbols. I can use the “change name” function of my IDE to change the names, or the types, without worrying that I missed some -1 or 2 somewhere in the code or that some int argument declaration is left poorly described.
The description field at line 208 does not seem to be used by anyone. I deleted it along with its accessor and mutator [G9].
I also deleted the degenerate default constructor at line 213 [G12]. The compiler will generate it for us.
We can skip over the isValidWeekdayCode method (lines 216–238) because we deleted it when we created the Day enumeration.
This brings us to the stringToWeekdayCode method (lines 242–270). Javadocs that don’t add much to the method signature are just clutter [C3],[G12]. The only value this Javadoc adds is the description of the -1 return value. However, because we changed to the Day enumeration, the comment is actually wrong [C2]. The method now throws an IllegalArgumentException. So I deleted the Javadoc.
I also deleted all the final keywords in arguments and variable declarations. As far as I could tell, they added no real value but did add to the clutter [G12]. Eliminating final flies in the face of some conventional wisdom. For example, Robert Simmons6 strongly recommends us to “. . . spread final all over your code.” Clearly I disagree. I think that there are a few good uses for final, such as the occasional final constant, but otherwise the keyword adds little value and creates a lot of clutter. Perhaps I feel this way because the kinds of errors that final might catch are already caught by the unit tests I write.
6. [Simmons04], p. 73.
I didn’t care for the duplicate if statements [G5] inside the for loop (line 259 and line 263), so I connected them into a single if statement using the || operator. I also used the Day enumeration to direct the for loop and made a few other cosmetic changes.
It occurred to me that this method does not really belong in DayDate. It’s really the parse function of Day. So I moved it into the Day enumeration. However, that made the Day enumeration pretty large. Because the concept of Day does not depend on DayDate, I moved the Day enumeration outside of the DayDate class into its own source file [G13].
I also moved the next function, weekdayCodeToString (lines 272–286) into the Day enumeration and called it toString.
```java
public enum Day {
MONDAY(Calendar.MONDAY),
TUESDAY(Calendar.TUESDAY),
WEDNESDAY(Calendar.WEDNESDAY),s
THURSDAY(Calendar.THURSDAY),
FRIDAY(Calendar.FRIDAY),
SATURDAY(Calendar.SATURDAY),
SUNDAY(Calendar.SUNDAY);
public final int index;
private static DateFormatSymbols dateSymbols = new DateFormatSymbols();
Day(int day) {
index = day;
}
public static Day make(int index) throws IllegalArgumentException {
for (Day d : Day.values())
if (d.index == index)
return d;
throw new IllegalArgumentException(
String.format(“Illegal day index: %d.”, index));
}
public static Day parse(String s) throws IllegalArgumentException {
String[] shortWeekdayNames =
dateSymbols.getShortWeekdays();
String[] weekDayNames =
dateSymbols.getWeekdays();
s = s.trim();
for (Day day : Day.values()) {
if (s.equalsIgnoreCase(shortWeekdayNames[day.index]) ||
s.equalsIgnoreCase(weekDayNames[day.index])) {
return day;
}
}
throw new IllegalArgumentException(
String.format(“%s is not a valid weekday string”, s));
}
public String toString() {
return dateSymbols.getWeekdays()[index];
}
}
```
There are two getMonths functions (lines 288–316). The first calls the second. The second is never called by anyone but the first. Therefore, I collapsed the two into one and vastly simplified them [G9],[G12],[F4]. Finally, I changed the name to be a bit more self-descriptive [N1].
```java
public static String[] getMonthNames() {
return dateFormatSymbols.getMonths();
}
```
The isValidMonthCode function (lines 326–346) was made irrelevant by the Month enum, so I deleted it [G9].
The monthCodeToQuarter function (lines 356–375) smells of FEATURE ENVY7 [G14] and probably belongs in the Month enum as a method named quarter. So I replaced it.
7. [Refactoring].
```java
public int quarter() {
return 1 + (index-1)/3;
}
```
This made the Month enum big enough to be in its own class. So I moved it out of DayDate to be consistent with the Day enum [G11],[G13].
The next two methods are named monthCodeToString (lines 377–426). Again, we see the pattern of one method calling its twin with a flag. It is usually a bad idea to pass a flag as an argument to a function, especially when that flag simply selects the format of the output [G15]. I renamed, simplified, and restructured these functions and moved them into the Month enum [N1],[N3],[C3],[G14].
```java
public String toString() {
return dateFormatSymbols.getMonths()[index - 1];
}
public String toShortString() {
return dateFormatSymbols.getShortMonths()[index - 1];
}
```
The next method is stringToMonthCode (lines 428–472). I renamed it, moved it into the Month enum, and simplified it [N1],[N3],[C3],[G14],[G12].
```java
public static Month parse(String s) {
s = s.trim();
for (Month m : Month.values())
if (m.matches(s))
return m;
try {
return make(Integer.parseInt(s));
}
catch (NumberFormatException e) {}
throw new IllegalArgumentException(“Invalid month ” + s);
}
private boolean matches(String s) {
return s.equalsIgnoreCase(toString()) ||
s.equalsIgnoreCase(toShortString());
}
```
The isLeapYear method (lines 495–517) can be made a bit more expressive [G16].
```java
public static boolean isLeapYear(int year) {
boolean fourth = year % 4 == 0;
boolean hundredth = year % 100 == 0;
boolean fourHundredth = year % 400 == 0;
return fourth && (!hundredth || fourHundredth);
}
```
The next function, leapYearCount (lines 519–536) doesn’t really belong in DayDate. Nobody calls it except for two methods in SpreadsheetDate. So I pushed it down [G6].
The lastDayOfMonth function (lines 538–560) makes use of the LAST_DAY_OF_MONTH array. This array really belongs in the Month enum [G17], so I moved it there. I also simplified the function and made it a bit more expressive [G16].
```java
public static int lastDayOfMonth(Month month, int year) {
if (month == Month.FEBRUARY && isLeapYear(year))
return month.lastDay() + 1;
else
return month.lastDay();
}
```
Now things start to get a bit more interesting. The next function is addDays (lines 562–576). First of all, because this function operates on the variables of DayDate, it should not be static [G18]. So I changed it to an instance method. Second, it calls the function toSerial. This function should be renamed toOrdinal [N1]. Finally, the method can be simplified.
```java
public DayDate addDays(int days) {
return DayDateFactory.makeDate(toOrdinal() + days);
}
```
The same goes for addMonths (lines 578–602). It should be an instance method [G18]. The algorithm is a bit complicated, so I used EXPLAINING TEMPORARY VARIABLES8 [G19] to make it more transparent. I also renamed the method getYYY to getYear [N1].
8. [Beck97].
```java
public DayDate addMonths(int months) {
int thisMonthAsOrdinal = 12 * getYear() + getMonth().index - 1;
int resultMonthAsOrdinal = thisMonthAsOrdinal + months;
int resultYear = resultMonthAsOrdinal / 12;
Month resultMonth = Month.make(resultMonthAsOrdinal % 12 + 1);
int lastDayOfResultMonth = lastDayOfMonth(resultMonth, resultYear);
int resultDay = Math.min(getDayOfMonth(), lastDayOfResultMonth);
return DayDateFactory.makeDate(resultDay, resultMonth, resultYear);
}
```
The addYears function (lines 604–626) provides no surprises over the others.
```java
public DayDate plusYears(int years) {
int resultYear = getYear() + years;
int lastDayOfMonthInResultYear = lastDayOfMonth(getMonth(), resultYear);
int resultDay = Math.min(getDayOfMonth(), lastDayOfMonthInResultYear);
return DayDateFactory.makeDate(resultDay, getMonth(), resultYear);
}
```
There is a little itch at the back of my mind that is bothering me about changing these methods from static to instance. Does the expression date.addDays(5) make it clear that the date object does not change and that a new instance of DayDate is returned? Or does it erroneously imply that we are adding five days to the date object? You might not think that is a big problem, but a bit of code that looks like the following can be very deceiving [G20].
```java
DayDate date = DateFactory.makeDate(5, Month.DECEMBER, 1952);
date.addDays(7); // bump date by one week.
```
Someone reading this code would very likely just accept that addDays is changing the date object. So we need a name that breaks this ambiguity [N4]. So I changed the names to plusDays and plusMonths. It seems to me that the intent of the method is captured nicely by
```java
DayDate date = oldDate.plusDays(5);
```
whereas the following doesn’t read fluidly enough for a reader to simply accept that the date object is changed:
```java
date.plusDays(5);
```
The algorithms continue to get more interesting. getPreviousDayOfWeek (lines 628–660) works but is a bit complicated. After some thought about what was really going on [G21], I was able to simplify it and use EXPLAINING TEMPORARY VARIABLES [G19] to make it clearer. I also changed it from a static method to an instance method [G18], and got rid of the duplicate instance method [G5] (lines 997–1008).
```java
public DayDate getPreviousDayOfWeek(Day targetDayOfWeek) {
int offsetToTarget = targetDayOfWeek.index - getDayOfWeek().index;
if (offsetToTarget >= 0)
offsetToTarget -= 7;
return plusDays(offsetToTarget);
}
```
The exact same analysis and result occurred for getFollowingDayOfWeek (lines 662–693).
```java
public DayDate getFollowingDayOfWeek(Day targetDayOfWeek) {
int offsetToTarget = targetDayOfWeek.index - getDayOfWeek().index;
if (offsetToTarget <= 0)
offsetToTarget += 7;
return plusDays(offsetToTarget);
}
```
The next function is getNearestDayOfWeek (lines 695–726), which we corrected back on page 270. But the changes I made back then aren’t consistent with the current pattern in the last two functions [G11]. So I made it consistent and used some EXPLAINING TEMPORARY VARIABLES [G19] to clarify the algorithm.
```java
public DayDate getNearestDayOfWeek(final Day targetDay) {
int offsetToThisWeeksTarget = targetDay.index - getDayOfWeek().index;
int offsetToFutureTarget = (offsetToThisWeeksTarget + 7) % 7;
int offsetToPreviousTarget = offsetToFutureTarget - 7;
if (offsetToFutureTarget > 3)
return plusDays(offsetToPreviousTarget);
else
return plusDays(offsetToFutureTarget);
}
```
The getEndOfCurrentMonth method (lines 728–740) is a little strange because it is an instance method that envies [G14] its own class by taking a DayDate argument. I made it a true instance method and clarified a few names.
```java
public DayDate getEndOfMonth() {
Month month = getMonth();
int year = getYear();
int lastDay = lastDayOfMonth(month, year);
return DayDateFactory.makeDate(lastDay, month, year);
}
```
Refactoring weekInMonthToString (lines 742–761) turned out to be very interesting indeed. Using the refactoring tools of my IDE, I first moved the method to the WeekInMonth enum that I created back on page 275. Then I renamed the method to toString. Next, I changed it from a static method to an instance method. All the tests still passed. (Can you guess where I am going?)
Next, I deleted the method entirely! Five asserts failed (lines 411–415, Listing B-4, page 374). I changed these lines to use the names of the enumerators (FIRST, SECOND, …). All the tests passed. Can you see why? Can you also see why each of these steps was necessary? The refactoring tool made sure that all previous callers of weekInMonthToString now called toString on the weekInMonth enumerator because all enumerators implement toString to simply return their names.…
Unfortunately, I was a bit too clever. As elegant as that wonderful chain of refactorings was, I finally realized that the only users of this function were the tests I had just modified, so I deleted the tests.
Fool me once, shame on you. Fool me twice, shame on me! So after determining that nobody other than the tests called relativeToString (lines 765–781), I simply deleted the function and its tests.
We have finally made it to the abstract methods of this abstract class. And the first one is as appropriate as they come: toSerial (lines 838–844). Back on page 279 I had changed the name to toOrdinal. Having looked at it in this context, I decided the name should be changed to getOrdinalDay.
The next abstract method is toDate (lines 838–844). It converts a DayDate to a java.util.Date. Why is this method abstract? If we look at its implementation in SpreadsheetDate (lines 198–207, Listing B-5, page 382), we see that it doesn’t depend on anything in the implementation of that class [G6]. So I pushed it up.
The getYYYY, getMonth, and getDayOfMonth methods are nicely abstract. However, the getDayOfWeek method is another one that should be pulled up from SpreadSheetDate because it doesn’t depend on anything that can’t be found in DayDate [G6]. Or does it?
If you look carefully (line 247, Listing B-5, page 382), you’ll see that the algorithm implicitly depends on the origin of the ordinal day (in other words, the day of the week of day 0). So even though this function has no physical dependencies that couldn’t be moved to DayDate, it does have a logical dependency.
Logical dependencies like this bother me [G22]. If something logical depends on the implementation, then something physical should too. Also, it seems to me that the algorithm itself could be generic with a much smaller portion of it dependent on the implementation [G6].
So I created an abstract method in DayDate named getDayOfWeekForOrdinalZero and implemented it in SpreadsheetDate to return Day.SATURDAY. Then I moved the getDayOfWeek method up to DayDate and changed it to call getOrdinalDay and getDayOfWeekForOrdinal-Zero.
```java
public Day getDayOfWeek() {
Day startingDay = getDayOfWeekForOrdinalZero();
int startingOffset = startingDay.index - Day.SUNDAY.index;
return Day.make((getOrdinalDay() + startingOffset) % 7 + 1);
}
```
As a side note, look carefully at the comment on line 895 through line 899. Was this repetition really necessary? As usual, I deleted this comment along with all the others.
The next method is compare (lines 902–913). Again, this method is inappropriately abstract [G6], so I pulled the implementation up into DayDate. Also, the name does not communicate enough [N1]. This method actually returns the difference in days since the argument. So I changed the name to daysSince. Also, I noted that there weren’t any tests for this method, so I wrote them.
The next six functions (lines 915–980) are all abstract methods that should be implemented in DayDate. So I pulled them all up from SpreadsheetDate.
The last function, isInRange (lines 982–995) also needs to be pulled up and refactored. The switch statement is a bit ugly [G23] and can be replaced by moving the cases into the DateInterval enum.
```java
public enum DateInterval {
OPEN {
public boolean isIn(int d, int left, int right) {
return d > left && d < right;
}
},
CLOSED_LEFT {
public boolean isIn(int d, int left, int right) {
return d >= left && d < right;
}
},
CLOSED_RIGHT {
public boolean isIn(int d, int left, int right) {
return d > left && d <= right;
}
},
CLOSED {
public boolean isIn(int d, int left, int right) {
return d >= left && d <= right;
}
};
public abstract boolean isIn(int d, int left, int right);
}
public boolean isInRange(DayDate d1, DayDate d2, DateInterval interval) {
int left = Math.min(d1.getOrdinalDay(), d2.getOrdinalDay());
int right = Math.max(d1.getOrdinalDay(), d2.getOrdinalDay());
return interval.isIn(getOrdinalDay(), left, right);
}
```
That brings us to the end of DayDate. So now we’ll make one more pass over the whole class to see how well it flows.
First, the opening comment is long out of date, so I shortened and improved it [C2].
Next, I moved all the remaining enums out into their own files [G12].
Next, I moved the static variable (dateFormatSymbols) and three static methods (getMonthNames, isLeapYear, lastDayOfMonth) into a new class named DateUtil [G6].
I moved the abstract methods up to the top where they belong [G24].
I changed Month.make to Month.fromInt [N1] and did the same for all the other enums. I also created a toInt() accessor for all the enums and made the index field private.
There was some interesting duplication [G5] in plusYears and plusMonths that I was able to eliminate by extracting a new method named correctLastDayOfMonth, making the all three methods much clearer.
I got rid of the magic number 1 [G25], replacing it with Month.JANUARY.toInt() or Day.SUNDAY.toInt(), as appropriate. I spent a little time with SpreadsheetDate, cleaning up the algorithms a bit. The end result is contained in Listing B-7, page 394, through Listing B-16, page 405.
Interestingly the code coverage in DayDate has decreased to 84.9 percent! This is not because less functionality is being tested; rather it is because the class has shrunk so much that the few uncovered lines have a greater weight. DayDate now has 45 out of 53 executable statements covered by tests. The uncovered lines are so trivial that they weren’t worth testing.
CONCLUSION
So once again we’ve followed the Boy Scout Rule. We’ve checked the code in a bit cleaner than when we checked it out. It took a little time, but it was worth it. Test coverage was increased, some bugs were fixed, the code was clarified and shrunk. The next person to look at this code will hopefully find it easier to deal with than we did. That person will also probably be able to clean it up a bit more than we did.
================================================
FILE: docs/ch17.md
================================================
# 第 17 章 Smells and Heuristics

In his wonderful book Refactoring,1 Martin Fowler identified many different “Code Smells.” The list that follows includes many of Martin’s smells and adds many more of my own. It also includes other pearls and heuristics that I use to practice my trade.
1. [Refactoring].
I compiled this list by walking through several different programs and refactoring them. As I made each change, I asked myself why I made that change and then wrote the reason down here. The result is a rather long list of things that smell bad to me when I read code.
This list is meant to be read from top to bottom and also to be used as a reference. There is a cross-reference for each heuristic that shows you where it is referenced in the rest of the text in “Appendix C” on page 409.
COMMENTS
C1: Inappropriate Information
It is inappropriate for a comment to hold information better held in a different kind of system such as your source code control system, your issue tracking system, or any other record-keeping system. Change histories, for example, just clutter up source files with volumes of historical and uninteresting text. In general, meta-data such as authors, last-modified-date, SPR number, and so on should not appear in comments. Comments should be reserved for technical notes about the code and design.
C2: Obsolete Comment
A comment that has gotten old, irrelevant, and incorrect is obsolete. Comments get old quickly. It is best not to write a comment that will become obsolete. If you find an obsolete comment, it is best to update it or get rid of it as quickly as possible. Obsolete comments tend to migrate away from the code they once described. They become floating islands of irrelevance and misdirection in the code.
C3: Redundant Comment
A comment is redundant if it describes something that adequately describes itself. For example:
```java
i++; // increment i
```
Another example is a Javadoc that says nothing more than (or even less than) the function signature:
```java
/**
* @param sellRequest
* @return
* @throws ManagedComponentException
*/
public SellResponse beginSellItem(SellRequest sellRequest)
throws ManagedComponentException
```
Comments should say things that the code cannot say for itself.
C4: Poorly Written Comment
A comment worth writing is worth writing well. If you are going to write a comment, take the time to make sure it is the best comment you can write. Choose your words carefully. Use correct grammar and punctuation. Don’t ramble. Don’t state the obvious. Be brief.
C5: Commented-Out Code
It makes me crazy to see stretches of code that are commented out. Who knows how old it is? Who knows whether or not it’s meaningful? Yet no one will delete it because everyone assumes someone else needs it or has plans for it.
That code sits there and rots, getting less and less relevant with every passing day. It calls functions that no longer exist. It uses variables whose names have changed. It follows conventions that are long obsolete. It pollutes the modules that contain it and distracts the people who try to read it. Commented-out code is an abomination.
When you see commented-out code, delete it! Don’t worry, the source code control system still remembers it. If anyone really needs it, he or she can go back and check out a previous version. Don’t suffer commented-out code to survive.
ENVIRONMENT
E1: Build Requires More Than One Step
Building a project should be a single trivial operation. You should not have to check many little pieces out from source code control. You should not need a sequence of arcane commands or context dependent scripts in order to build the individual elements. You should not have to search near and far for all the various little extra JARs, XML files, and other artifacts that the system requires. You should be able to check out the system with one simple command and then issue one other simple command to build it.
```sh
svn get mySystem
cd mySystem
ant all
```
E2: Tests Require More Than One Step
You should be able to run all the unit tests with just one command. In the best case you can run all the tests by clicking on one button in your IDE. In the worst case you should be able to issue a single simple command in a shell. Being able to run all the tests is so fundamental and so important that it should be quick, easy, and obvious to do.
FUNCTIONS
F1: Too Many Arguments
Functions should have a small number of arguments. No argument is best, followed by one, two, and three. More than three is very questionable and should be avoided with prejudice. (See “Function Arguments” on page 40.)
F2: Output Arguments
Output arguments are counterintuitive. Readers expect arguments to be inputs, not outputs. If your function must change the state of something, have it change the state of the object it is called on. (See “Output Arguments” on page 45.)
F3: Flag Arguments
Boolean arguments loudly declare that the function does more than one thing. They are confusing and should be eliminated. (See “Flag Arguments” on page 41.)
F4: Dead Function
Methods that are never called should be discarded. Keeping dead code around is wasteful. Don’t be afraid to delete the function. Remember, your source code control system still remembers it.
GENERAL
G1: Multiple Languages in One Source File
Today’s modern programming environments make it possible to put many different languages into a single source file. For example, a Java source file might contain snippets of XML, HTML, YAML, JavaDoc, English, JavaScript, and so on. For another example, in addition to HTML a JSP file might contain Java, a tag library syntax, English comments, Javadocs, XML, JavaScript, and so forth. This is confusing at best and carelessly sloppy at worst.
The ideal is for a source file to contain one, and only one, language. Realistically, we will probably have to use more than one. But we should take pains to minimize both the number and extent of extra languages in our source files.
G2: Obvious Behavior Is Unimplemented
Following “The Principle of Least Surprise,”2 any function or class should implement the behaviors that another programmer could reasonably expect. For example, consider a function that translates the name of a day to an enum that represents the day.
2. Or “The Principle of Least Astonishment”: http://en.wikipedia.org/wiki/
Principle_of_least_astonishment
```java
Day day = DayDate.StringToDay(String dayName);
```
We would expect the string “Monday” to be translated to Day.MONDAY. We would also expect the common abbreviations to be translated, and we would expect the function to ignore case.
When an obvious behavior is not implemented, readers and users of the code can no longer depend on their intuition about function names. They lose their trust in the original author and must fall back on reading the details of the code.
G3: Incorrect Behavior at the Boundaries
It seems obvious to say that code should behave correctly. The problem is that we seldom realize just how complicated correct behavior is. Developers often write functions that they think will work, and then trust their intuition rather than going to the effort to prove that their code works in all the corner and boundary cases.
There is no replacement for due diligence. Every boundary condition, every corner case, every quirk and exception represents something that can confound an elegant and intuitive algorithm. Don’t rely on your intuition. Look for every boundary condition and write a test for it.
G4: Overridden Safeties
Chernobyl melted down because the plant manager overrode each of the safety mechanisms one by one. The safeties were making it inconvenient to run an experiment. The result was that the experiment did not get run, and the world saw it’s first major civilian nuclear catastrophe.
It is risky to override safeties. Exerting manual control over serialVersionUID may be necessary, but it is always risky. Turning off certain compiler warnings (or all warnings!) may help you get the build to succeed, but at the risk of endless debugging sessions. Turning off failing tests and telling yourself you’ll get them to pass later is as bad as pretending your credit cards are free money.
G5: Duplication
This is one of the most important rules in this book, and you should take it very seriously. Virtually every author who writes about software design mentions this rule. Dave Thomas and Andy Hunt called it the DRY3 principle (Don’t Repeat Yourself). Kent Beck made it one of the core principles of Extreme Programming and called it: “Once, and only once.” Ron Jeffries ranks this rule second, just below getting all the tests to pass.
3. [PRAG].
Every time you see duplication in the code, it represents a missed opportunity for abstraction. That duplication could probably become a subroutine or perhaps another class outright. By folding the duplication into such an abstraction, you increase the vocabulary of the language of your design. Other programmers can use the abstract facilities you create. Coding becomes faster and less error prone because you have raised the abstraction level.
The most obvious form of duplication is when you have clumps of identical code that look like some programmers went wild with the mouse, pasting the same code over and over again. These should be replaced with simple methods.
A more subtle form is the switch/case or if/else chain that appears again and again in various modules, always testing for the same set of conditions. These should be replaced with polymorphism.
Still more subtle are the modules that have similar algorithms, but that don’t share similar lines of code. This is still duplication and should be addressed by using the TEMPLATE METHOD,4 or STRATEGY5 pattern.
4. [GOF].
5. [GOF].
Indeed, most of the design patterns that have appeared in the last fifteen years are simply well-known ways to eliminate duplication. So too the Codd Normal Forms are a strategy for eliminating duplication in database schemae. OO itself is a strategy for organizing modules and eliminating duplication. Not surprisingly, so is structured programming.
I think the point has been made. Find and eliminate duplication wherever you can.
G6: Code at Wrong Level of Abstraction
It is important to create abstractions that separate higher level general concepts from lower level detailed concepts. Sometimes we do this by creating abstract classes to hold the higher level concepts and derivatives to hold the lower level concepts. When we do this, we need to make sure that the separation is complete. We want all the lower level concepts to be in the derivatives and all the higher level concepts to be in the base class.
For example, constants, variables, or utility functions that pertain only to the detailed implementation should not be present in the base class. The base class should know nothing about them.
This rule also pertains to source files, components, and modules. Good software design requires that we separate concepts at different levels and place them in different containers. Sometimes these containers are base classes or derivatives and sometimes they are source files, modules, or components. Whatever the case may be, the separation needs to be complete. We don’t want lower and higher level concepts mixed together.
Consider the following code:
```java
public interface Stack {
Object pop() throws EmptyException;
void push(Object o) throws FullException;
double percentFull();
class EmptyException extends Exception {}
class FullException extends Exception {}
}
```
The percentFull function is at the wrong level of abstraction. Although there are many implementations of Stack where the concept of fullness is reasonable, there are other implementations that simply could not know how full they are. So the function would be better placed in a derivative interface such as BoundedStack.
Perhaps you are thinking that the implementation could just return zero if the stack were boundless. The problem with that is that no stack is truly boundless. You cannot really prevent an OutOfMemoryException by checking for
```java
stack.percentFull() < 50.0.
```
Implementing the function to return 0 would be telling a lie.
The point is that you cannot lie or fake your way out of a misplaced abstraction. Isolating abstractions is one of the hardest things that software developers do, and there is no quick fix when you get it wrong.
G7: Base Classes Depending on Their Derivatives
The most common reason for partitioning concepts into base and derivative classes is so that the higher level base class concepts can be independent of the lower level derivative class concepts. Therefore, when we see base classes mentioning the names of their derivatives, we suspect a problem. In general, base classes should know nothing about their derivatives.
There are exceptions to this rule, of course. Sometimes the number of derivatives is strictly fixed, and the base class has code that selects between the derivatives. We see this a lot in finite state machine implementations. However, in that case the derivatives and base class are strongly coupled and always deploy together in the same jar file. In the general case we want to be able to deploy derivatives and bases in different jar files.
Deploying derivatives and bases in different jar files and making sure the base jar files know nothing about the contents of the derivative jar files allow us to deploy our systems in discrete and independent components. When such components are modified, they can be redeployed without having to redeploy the base components. This means that the impact of a change is greatly lessened, and maintaining systems in the field is made much simpler.
G8: Too Much Information
Well-defined modules have very small interfaces that allow you to do a lot with a little. Poorly defined modules have wide and deep interfaces that force you to use many different gestures to get simple things done. A well-defined interface does not offer very many functions to depend upon, so coupling is low. A poorly defined interface provides lots of functions that you must call, so coupling is high.
Good software developers learn to limit what they expose at the interfaces of their classes and modules. The fewer methods a class has, the better. The fewer variables a function knows about, the better. The fewer instance variables a class has, the better.
Hide your data. Hide your utility functions. Hide your constants and your temporaries. Don’t create classes with lots of methods or lots of instance variables. Don’t create lots of protected variables and functions for your subclasses. Concentrate on keeping interfaces very tight and very small. Help keep coupling low by limiting information.
G9: Dead Code
Dead code is code that isn’t executed. You find it in the body of an if statement that checks for a condition that can’t happen. You find it in the catch block of a try that never throws. You find it in little utility methods that are never called or switch/case conditions that never occur.
The problem with dead code is that after awhile it starts to smell. The older it is, the stronger and sourer the odor becomes. This is because dead code is not completely updated when designs change. It still compiles, but it does not follow newer conventions or rules. It was written at a time when the system was different. When you find dead code, do the right thing. Give it a decent burial. Delete it from the system.
G10: Vertical Separation
Variables and function should be defined close to where they are used. Local variables should be declared just above their first usage and should have a small vertical scope. We don’t want local variables declared hundreds of lines distant from their usages.
Private functions should be defined just below their first usage. Private functions belong to the scope of the whole class, but we’d still like to limit the vertical distance between the invocations and definitions. Finding a private function should just be a matter of scanning downward from the first usage.
G11: Inconsistency
If you do something a certain way, do all similar things in the same way. This goes back to the principle of least surprise. Be careful with the conventions you choose, and once chosen, be careful to continue to follow them.
If within a particular function you use a variable named response to hold an HttpServletResponse, then use the same variable name consistently in the other functions that use HttpServletResponse objects. If you name a method processVerificationRequest, then use a similar name, such as processDeletionRequest, for the methods that process other kinds of requests.
Simple consistency like this, when reliably applied, can make code much easier to read and modify.
G12: Clutter
Of what use is a default constructor with no implementation? All it serves to do is clutter up the code with meaningless artifacts. Variables that aren’t used, functions that are never called, comments that add no information, and so forth. All these things are clutter and should be removed. Keep your source files clean, well organized, and free of clutter.
G13: Artificial Coupling
Things that don’t depend upon each other should not be artificially coupled. For example, general enums should not be contained within more specific classes because this forces the whole application to know about these more specific classes. The same goes for general purpose static functions being declared in specific classes.
In general an artificial coupling is a coupling between two modules that serves no direct purpose. It is a result of putting a variable, constant, or function in a temporarily convenient, though inappropriate, location. This is lazy and careless.
Take the time to figure out where functions, constants, and variables ought to be declared. Don’t just toss them in the most convenient place at hand and then leave them there.
G14: Feature Envy
This is one of Martin Fowler’s code smells.6 The methods of a class should be interested in the variables and functions of the class they belong to, and not the variables and functions of other classes. When a method uses accessors and mutators of some other object to manipulate the data within that object, then it envies the scope of the class of that other object. It wishes that it were inside that other class so that it could have direct access to the variables it is manipulating. For example:
6. [Refactoring].
```java
public class HourlyPayCalculator {
public Money calculateWeeklyPay(HourlyEmployee e) {
int tenthRate = e.getTenthRate().getPennies();
int tenthsWorked = e.getTenthsWorked();
int straightTime = Math.min(400, tenthsWorked);
int overTime = Math.max(0, tenthsWorked - straightTime);
int straightPay = straightTime * tenthRate;
int overtimePay = (int)Math.round(overTime*tenthRate*1.5);
return new Money(straightPay + overtimePay);
}
}
```
The calculateWeeklyPay method reaches into the HourlyEmployee object to get the data on which it operates. The calculateWeeklyPay method envies the scope of HourlyEmployee. It “wishes” that it could be inside HourlyEmployee.
All else being equal, we want to eliminate Feature Envy because it exposes the internals of one class to another. Sometimes, however, Feature Envy is a necessary evil. Consider the following:
```java
public class HourlyEmployeeReport {
private HourlyEmployee employee ;
public HourlyEmployeeReport(HourlyEmployee e) {
this.employee = e;
}
String reportHours() {
return String.format(
“Name: %s\tHours:%d.%1d\n”,
employee.getName(),
employee.getTenthsWorked()/10,
employee.getTenthsWorked()%10);
}
}
```
Clearly, the reportHours method envies the HourlyEmployee class. On the other hand, we don’t want HourlyEmployee to have to know about the format of the report. Moving that format string into the HourlyEmployee class would violate several principles of object oriented design.7 It would couple HourlyEmployee to the format of the report, exposing it to changes in that format.
7. Specifically, the Single Responsibility Principle, the Open Closed Principle, and the Common Closure Principle. See [PPP].
G15: Selector Arguments
There is hardly anything more abominable than a dangling false argument at the end of a function call. What does it mean? What would it change if it were true? Not only is the purpose of a selector argument difficult to remember, each selector argument combines many functions into one. Selector arguments are just a lazy way to avoid splitting a large function into several smaller functions. Consider:
```java
public int calculateWeeklyPay(boolean overtime) {
int tenthRate = getTenthRate();
int tenthsWorked = getTenthsWorked();
int straightTime = Math.min(400, tenthsWorked);
int overTime = Math.max(0, tenthsWorked - straightTime);
int straightPay = straightTime * tenthRate;
double overtimeRate = overtime ? 1.5 : 1.0 * tenthRate;
int overtimePay = (int)Math.round(overTime*overtimeRate);
return straightPay + overtimePay;
}
```
You call this function with a true if overtime is paid as time and a half, and with a false if overtime is paid as straight time. It’s bad enough that you must remember what calculateWeeklyPay(false) means whenever you happen to stumble across it. But the real shame of a function like this is that the author missed the opportunity to write the following:
```java
public int straightPay() {
return getTenthsWorked() * getTenthRate();
}
public int overTimePay() {
int overTimeTenths = Math.max(0, getTenthsWorked() - 400);
int overTimePay = overTimeBonus(overTimeTenths);
return straightPay() + overTimePay;
}
private int overTimeBonus(int overTimeTenths) {
double bonus = 0.5 * getTenthRate() * overTimeTenths;
return (int) Math.round(bonus);
}
```
Of course, selectors need not be boolean. They can be enums, integers, or any other type of argument that is used to select the behavior of the function. In general it is better to have many functions than to pass some code into a function to select the behavior.
G16: Obscured Intent
We want code to be as expressive as possible. Run-on expressions, Hungarian notation, and magic numbers all obscure the author’s intent. For example, here is the overTimePay function as it might have appeared:
```java
public int m_otCalc() {
return iThsWkd * iThsRte +
(int) Math.round(0.5 * iThsRte *
Math.max(0, iThsWkd - 400)
);
}
```
Small and dense as this might appear, it’s also virtually impenetrable. It is worth taking the time to make the intent of our code visible to our readers.
G17: Misplaced Responsibility
One of the most important decisions a software developer can make is where to put code. For example, where should the PI constant go? Should it be in the Math class? Perhaps it belongs in the Trigonometry class? Or maybe in the Circle class?
The principle of least surprise comes into play here. Code should be placed where a reader would naturally expect it to be. The PI constant should go where the trig functions are declared. The OVERTIME_RATE constant should be declared in the HourlyPay-Calculator class.
Sometimes we get “clever” about where to put certain functionality. We’ll put it in a function that’s convenient for us, but not necessarily intuitive to the reader. For example, perhaps we need to print a report with the total of hours that an employee worked. We could sum up those hours in the code that prints the report, or we could try to keep a running total in the code that accepts time cards.
One way to make this decision is to look at the names of the functions. Let’s say that our report module has a function named getTotalHours. Let’s also say that the module that accepts time cards has a saveTimeCard function. Which of these two functions, by it’s name, implies that it calculates the total? The answer should be obvious.
Clearly, there are sometimes performance reasons why the total should be calculated as time cards are accepted rather than when the report is printed. That’s fine, but the names of the functions ought to reflect this. For example, there should be a computeRunning-TotalOfHours function in the timecard module.
G18: Inappropriate Static
Math.max(double a, double b) is a good static method. It does not operate on a single instance; indeed, it would be silly to have to say new Math().max(a,b) or even a.max(b). All the data that max uses comes from its two arguments, and not from any “owning” object. More to the point, there is almost no chance that we’d want Math.max to be polymorphic.
Sometimes, however, we write static functions that should not be static. For example, consider:
```java
HourlyPayCalculator.calculatePay(employee, overtimeRate).
```
Again, this seems like a reasonable static function. It doesn’t operate on any particular object and gets all it’s data from it’s arguments. However, there is a reasonable chance that we’ll want this function to be polymorphic. We may wish to implement several different algorithms for calculating hourly pay, for example, OvertimeHourlyPayCalculator and StraightTimeHourlyPayCalculator. So in this case the function should not be static. It should be a nonstatic member function of Employee.
In general you should prefer nonstatic methods to static methods. When in doubt, make the function nonstatic. If you really want a function to be static, make sure that there is no chance that you’ll want it to behave polymorphically.
G19: Use Explanatory Variables
Kent Beck wrote about this in his great book Smalltalk Best Practice Patterns8 and again more recently in his equally great book Implementation Patterns.9 One of the more powerful ways to make a program readable is to break the calculations up into intermediate values that are held in variables with meaningful names.
8. [Beck97], p. 108.
9. [Beck07].
Consider this example from FitNesse:
```java
Matcher match = headerPattern.matcher(line);
if(match.find())
{
String key = match.group(1);
String value = match.group(2);
headers.put(key.toLowerCase(), value);
}
```
The simple use of explanatory variables makes it clear that the first matched group is the key, and the second matched group is the value.
It is hard to overdo this. More explanatory variables are generally better than fewer. It is remarkable how an opaque module can suddenly become transparent simply by breaking the calculations up into well-named intermediate values.
G20: Function Names Should Say What They Do
Look at this code:
```java
Date newDate = date.add(5);
```
Would you expect this to add five days to the date? Or is it weeks, or hours? Is the date instance changed or does the function just return a new Date without changing the old one? You can’t tell from the call what the function does.
If the function adds five days to the date and changes the date, then it should be called addDaysTo or increaseByDays. If, on the other hand, the function returns a new date that is five days later but does not change the date instance, it should be called daysLater or daysSince.
If you have to look at the implementation (or documentation) of the function to know what it does, then you should work to find a better name or rearrange the functionality so that it can be placed in functions with better names.
G21: Understand the Algorithm
Lots of very funny code is written because people don’t take the time to understand the algorithm. They get something to work by plugging in enough if statements and flags, without really stopping to consider what is really going on.
Programming is often an exploration. You think you know the right algorithm for something, but then you wind up fiddling with it, prodding and poking at it, until you get it to “work.” How do you know it “works”? Because it passes the test cases you can think of.
There is nothing wrong with this approach. Indeed, often it is the only way to get a function to do what you think it should. However, it is not sufficient to leave the quotation marks around the word “work.”
Before you consider yourself to be done with a function, make sure you understand how it works. It is not good enough that it passes all the tests. You must know10 that the solution is correct.
10. There is a difference between knowing how the code works and knowing whether the algorithm will do the job required of it. Being unsure that an algorithm is appropriate is often a fact of life. Being unsure what your code does is just laziness.
Often the best way to gain this knowledge and understanding is to refactor the function into something that is so clean and expressive that it is obvious how it works.
G22: Make Logical Dependencies Physical
If one module depends upon another, that dependency should be physical, not just logical. The dependent module should not make assumptions (in other words, logical dependencies) about the module it depends upon. Rather it should explicitly ask that module for all the information it depends upon.
For example, imagine that you are writing a function that prints a plain text report of hours worked by employees. One class named HourlyReporter gathers all the data into a convenient form and then passes it to HourlyReportFormatter to print it. (See Listing 17-1.)
Listing 17-1 HourlyReporter.java
```java
public class HourlyReporter {
private HourlyReportFormatter formatter;
private List page;
private final int PAGE_SIZE = 55;
public HourlyReporter(HourlyReportFormatter formatter) {
this.formatter = formatter;
page = new ArrayList();
}
public void generateReport(List employees) {
for (HourlyEmployee e : employees) {
addLineItemToPage(e);
if (page.size() == PAGE_SIZE)
printAndClearItemList();
}
if (page.size() > 0)
printAndClearItemList();
}
private void printAndClearItemList() {
formatter.format(page);
page.clear();
}
private void addLineItemToPage(HourlyEmployee e) {
LineItem item = new LineItem();
item.name = e.getName();
item.hours = e.getTenthsWorked() / 10;
item.tenths = e.getTenthsWorked() % 10;
page.add(item);
}
public class LineItem {
public String name;
public int hours;
public int tenths;
}
}
```
This code has a logical dependency that has not been physicalized. Can you spot it? It is the constant PAGE_SIZE. Why should the HourlyReporter know the size of the page? Page size should be the responsibility of the HourlyReportFormatter.
The fact that PAGE_SIZE is declared in HourlyReporter represents a misplaced responsibility [G17] that causes HourlyReporter to assume that it knows what the page size ought to be. Such an assumption is a logical dependency. HourlyReporter depends on the fact that HourlyReportFormatter can deal with page sizes of 55. If some implementation of HourlyReportFormatter could not deal with such sizes, then there would be an error.
We can physicalize this dependency by creating a new method in HourlyReport-Formatter named getMaxPageSize(). HourlyReporter will then call that function rather than using the PAGE_SIZE constant.
G23: Prefer Polymorphism to If/Else or Switch/Case
This might seem a strange suggestion given the topic of Chapter 6. After all, in that chapter I make the point that switch statements are probably appropriate in the parts of the system where adding new functions is more likely than adding new types.
First, most people use switch statements because it’s the obvious brute force solution, not because it’s the right solution for the situation. So this heuristic is here to remind us to consider polymorphism before using a switch.
Second, the cases where functions are more volatile than types are relatively rare. So every switch statement should be suspect.
I use the following “ONE SWITCH” rule: There may be no more than one switch statement for a given type of selection. The cases in that switch statement must create polymorphic objects that take the place of other such switch statements in the rest of the system.
G24: Follow Standard Conventions
Every team should follow a coding standard based on common industry norms. This coding standard should specify things like where to declare instance variables; how to name classes, methods, and variables; where to put braces; and so on. The team should not need a document to describe these conventions because their code provides the examples.
Everyone on the team should follow these conventions. This means that each team member must be mature enough to realize that it doesn’t matter a whit where you put your braces so long as you all agree on where to put them.
If you would like to know what conventions I follow, you’ll see them in the refactored code in Listing B-7 on page 394, through Listing B-14.
G25: Replace Magic Numbers with Named Constants
This is probably one of the oldest rules in software development. I remember reading it in the late sixties in introductory COBOL, FORTRAN, and PL/1 manuals. In general it is a bad idea to have raw numbers in your code. You should hide them behind well-named constants.
For example, the number 86,400 should be hidden behind the constant SECONDS_PER_DAY. If you are printing 55 lines per page, then the constant 55 should be hidden behind the constant LINES_PER_PAGE.
Some constants are so easy to recognize that they don’t always need a named constant to hide behind so long as they are used in conjunction with very self-explanatory code. For example:
```java
double milesWalked = feetWalked/5280.0;
int dailyPay = hourlyRate * 8;
double circumference = radius * Math.PI * 2;
```
Do we really need the constants FEET_PER_MILE, WORK_HOURS_PER_DAY, and TWO in the above examples? Clearly, the last case is absurd. There are some formulae in which constants are simply better written as raw numbers. You might quibble about the WORK_HOURS_PER_DAY case because the laws or conventions might change. On the other hand, that formula reads so nicely with the 8 in it that I would be reluctant to add 17 extra characters to the readers’ burden. And in the FEET_PER_MILE case, the number 5280 is so very well known and so unique a constant that readers would recognize it even if it stood alone on a page with no context surrounding it.
Constants like 3.141592653589793 are also very well known and easily recognizable. However, the chance for error is too great to leave them raw. Every time someone sees 3.1415927535890793, they know that it is π, and so they fail to scrutinize it. (Did you catch the single-digit error?) We also don’t want people using 3.14, 3.14159, 3.142, and so forth. Therefore, it is a good thing that Math.PI has already been defined for us.
The term “Magic Number” does not apply only to numbers. It applies to any token that has a value that is not self-describing. For example:
```java
assertEquals(7777, Employee.find(“John Doe”).employeeNumber());
```
There are two magic numbers in this assertion. The first is obviously 7777, though what it might mean is not obvious. The second magic number is “John Doe,” and again the intent is not clear.
It turns out that “John Doe” is the name of employee #7777 in a well-known test database created by our team. Everyone in the team knows that when you connect to this database, it will have several employees already cooked into it with well-known values and attributes. It also turns out that “John Doe” represents the sole hourly employee in that test database. So this test should really read:
```java
assertEquals(
HOURLY_EMPLOYEE_ID,
Employee.find(HOURLY_EMPLOYEE_NAME).employeeNumber());
```
G26: Be Precise
Expecting the first match to be the only match to a query is probably naive. Using floating point numbers to represent currency is almost criminal. Avoiding locks and/or transaction management because you don’t think concurrent update is likely is lazy at best. Declaring a variable to be an ArrayList when a List will due is overly constraining. Making all variables protected by default is not constraining enough.
When you make a decision in your code, make sure you make it precisely. Know why you have made it and how you will deal with any exceptions. Don’t be lazy about the precision of your decisions. If you decide to call a function that might return null, make sure you check for null. If you query for what you think is the only record in the database, make sure your code checks to be sure there aren’t others. If you need to deal with currency, use integers11 and deal with rounding appropriately. If there is the possibility of concurrent update, make sure you implement some kind of locking mechanism.
11. Or better yet, a Money class that uses integers.
Ambiguities and imprecision in code are either a result of disagreements or laziness. In either case they should be eliminated.
G27: Structure over Convention
Enforce design decisions with structure over convention. Naming conventions are good, but they are inferior to structures that force compliance. For example, switch/cases with nicely named enumerations are inferior to base classes with abstract methods. No one is forced to implement the switch/case statement the same way each time; but the base classes do enforce that concrete classes have all abstract methods implemented.
G28: Encapsulate Conditionals
Boolean logic is hard enough to understand without having to see it in the context of an if or while statement. Extract functions that explain the intent of the conditional.
For example:
```java
if (shouldBeDeleted(timer))
```
is preferable to
```java
if (timer.hasExpired() && !timer.isRecurrent())
```
G29: Avoid Negative Conditionals
Negatives are just a bit harder to understand than positives. So, when possible, conditionals should be expressed as positives. For example:
```java
if (buffer.shouldCompact())
```
is preferable to
```java
if (!buffer.shouldNotCompact())
```
G30: Functions Should Do One Thing
It is often tempting to create functions that have multiple sections that perform a series of operations. Functions of this kind do more than one thing, and should be converted into many smaller functions, each of which does one thing.
For example:
```java
public void pay() {
for (Employee e : employees) {
if (e.isPayday()) {
Money pay = e.calculatePay();
e.deliverPay(pay);
}
}
}
```
This bit of code does three things. It loops over all the employees, checks to see whether each employee ought to be paid, and then pays the employee. This code would be better written as:
```java
public void pay() {
for (Employee e : employees)
payIfNecessary(e);
}
private void payIfNecessary(Employee e) {
if (e.isPayday())
calculateAndDeliverPay(e);
}
private void calculateAndDeliverPay(Employee e) {
Money pay = e.calculatePay();
e.deliverPay(pay);
}
```
Each of these functions does one thing. (See “Do One Thing” on page 35.)
G31: Hidden Temporal Couplings
Temporal couplings are often necessary, but you should not hide the coupling. Structure the arguments of your functions such that the order in which they should be called is obvious. Consider the following:
```java
public class MoogDiver {
Gradient gradient;
List splines;
public void dive(String reason) {
saturateGradient();
reticulateSplines();
diveForMoog(reason);
}
…
}
```
The order of the three functions is important. You must saturate the gradient before you can reticulate the splines, and only then can you dive for the moog. Unfortunately, the code does not enforce this temporal coupling. Another programmer could call reticulate-Splines before saturateGradient was called, leading to an UnsaturatedGradientException. A better solution is:
```java
public class MoogDiver {
Gradient gradient;
List splines;
public void dive(String reason) {
Gradient gradient = saturateGradient();
List splines = reticulateSplines(gradient);
diveForMoog(splines, reason);
}
…
}
```
This exposes the temporal coupling by creating a bucket brigade. Each function produces a result that the next function needs, so there is no reasonable way to call them out of order.
You might complain that this increases the complexity of the functions, and you’d be right. But that extra syntactic complexity exposes the true temporal complexity of the situation.
Note that I left the instance variables in place. I presume that they are needed by private methods in the class. Even so, I want the arguments in place to make the temporal coupling explicit.
G32: Don’t Be Arbitrary
Have a reason for the way you structure your code, and make sure that reason is communicated by the structure of the code. If a structure appears arbitrary, others will feel empowered to change it. If a structure appears consistently throughout the system, others will use it and preserve the convention. For example, I was recently merging changes to FitNesse and discovered that one of our committers had done this:
```java
public class AliasLinkWidget extends ParentWidget
{
public static class VariableExpandingWidgetRoot {
…
…
}
```
The problem with this was that VariableExpandingWidgetRoot had no need to be inside the scope of AliasLinkWidget. Moreover, other unrelated classes made use of AliasLinkWidget.VariableExpandingWidgetRoot. These classes had no need to know about AliasLinkWidget.
Perhaps the programmer had plopped the VariableExpandingWidgetRoot into AliasWidget as a matter of convenience, or perhaps he thought it really needed to be scoped inside AliasWidget. Whatever the reason, the result wound up being arbitrary. Public classes that are not utilities of some other class should not be scoped inside another class. The convention is to make them public at the top level of their package.
G33: Encapsulate Boundary Conditions
Boundary conditions are hard to keep track of. Put the processing for them in one place. Don’t let them leak all over the code. We don’t want swarms of +1s and -1s scattered hither and yon. Consider this simple example from FIT:
```java
if(level + 1 < tags.length)
{
parts = new Parse(body, tags, level + 1, offset + endTag);
body = null;
}
```
Notice that level+1 appears twice. This is a boundary condition that should be encapsulated within a variable named something like nextLevel.
```java
int nextLevel = level + 1;
if(nextLevel < tags.length)
{
parts = new Parse(body, tags, nextLevel, offset + endTag);
body = null;
}
```
G34: Functions Should Descend Only One Level of Abstraction
The statements within a function should all be written at the same level of abstraction, which should be one level below the operation described by the name of the function. This may be the hardest of these heuristics to interpret and follow. Though the idea is plain enough, humans are just far too good at seamlessly mixing levels of abstraction. Consider, for example, the following code taken from FitNesse:
```java
public String render() throws Exception
{
StringBuffer html = new StringBuffer(“
0)
html.append(” size=\“”).append(size + 1).append(”\“”);
html.append(“>”);
return html.toString();
}
```
A moment’s study and you can see what’s going on. This function constructs the HTML tag that draws a horizontal rule across the page. The height of that rule is specified in the size variable.
Now look again. This method is mixing at least two levels of abstraction. The first is the notion that a horizontal rule has a size. The second is the syntax of the HR tag itself. This code comes from the HruleWidget module in FitNesse. This module detects a row of four or more dashes and converts it into the appropriate HR tag. The more dashes, the larger the size.
I refactored this bit of code as follows. Note that I changed the name of the size field to reflect its true purpose. It held the number of extra dashes.
```java
public String render() throws Exception
{
HtmlTag hr = new HtmlTag(“hr”);
if (extraDashes > 0)
hr.addAttribute(“size”, hrSize(extraDashes));
return hr.html();
}
private String hrSize(int height)
{
int hrSize = height + 1;
return String.format(“%d”, hrSize);
}
```
This change separates the two levels of abstraction nicely. The render function simply constructs an HR tag, without having to know anything about the HTML syntax of that tag. The HtmlTag module takes care of all the nasty syntax issues.
Indeed, by making this change I caught a subtle error. The original code did not put the closing slash on the HR tag, as the XHTML standard would have it. (In other words, it emitted
instead of
.) The HtmlTag module had been changed to conform to XHTML long ago.
Separating levels of abstraction is one of the most important functions of refactoring, and it’s one of the hardest to do well. As an example, look at the code below. This was my first attempt at separating the abstraction levels in the HruleWidget.render method.
```java
public String render() throws Exception
{
HtmlTag hr = new HtmlTag(“hr”);
if (size > 0) {
hr.addAttribute(“size”, “”+(size+1));
}
return hr.html();
}
```
My goal, at this point, was to create the necessary separation and get the tests to pass. I accomplished that goal easily, but the result was a function that still had mixed levels of abstraction. In this case the mixed levels were the construction of the HR tag and the interpretation and formatting of the size variable. This points out that when you break a function along lines of abstraction, you often uncover new lines of abstraction that were obscured by the previous structure.
G35: Keep Configurable Data at High Levels
If you have a constant such as a default or configuration value that is known and expected at a high level of abstraction, do not bury it in a low-level function. Expose it as an argument to that low-level function called from the high-level function. Consider the following code from FitNesse:
```java
public static void main(String[] args) throws Exception
{
Arguments arguments = parseCommandLine(args);
…
}
public class Arguments
{
public static final String DEFAULT_PATH = “.”;
public static final String DEFAULT_ROOT = “FitNesseRoot”;
public static final int DEFAULT_PORT = 80;
public static final int DEFAULT_VERSION_DAYS = 14;
…
}
```
The command-line arguments are parsed in the very first executable line of FitNesse. The default values of those arguments are specified at the top of the Argument class. You don’t have to go looking in low levels of the system for statements like this one:
```java
if (arguments.port == 0) // use 80 by default
```
The configuration constants reside at a very high level and are easy to change. They get passed down to the rest of the application. The lower levels of the application do not own the values of these constants.
G36: Avoid Transitive Navigation
In general we don’t want a single module to know much about its collaborators. More specifically, if A collaborates with B, and B collaborates with C, we don’t want modules that use A to know about C. (For example, we don’t want a.getB().getC().doSomething();.)
This is sometimes called the Law of Demeter. The Pragmatic Programmers call it “Writing Shy Code.”12 In either case it comes down to making sure that modules know only about their immediate collaborators and do not know the navigation map of the whole system.
12. [PRAG], p. 138.
If many modules used some form of the statement a.getB().getC(), then it would be difficult to change the design and architecture to interpose a Q between B and C. You’d have to find every instance of a.getB().getC() and convert it to a.getB().getQ().getC(). This is how architectures become rigid. Too many modules know too much about the architecture.
Rather we want our immediate collaborators to offer all the services we need. We should not have to roam through the object graph of the system, hunting for the method we want to call. Rather we should simply be able to say:
```java
myCollaborator.doSomething().
```
JAVA
J1: Avoid Long Import Lists by Using Wildcards
If you use two or more classes from a package, then import the whole package with
```java
import package.*;
```
Long lists of imports are daunting to the reader. We don’t want to clutter up the tops of our modules with 80 lines of imports. Rather we want the imports to be a concise statement about which packages we collaborate with.
Specific imports are hard dependencies, whereas wildcard imports are not. If you specifically import a class, then that class must exist. But if you import a package with a wildcard, no particular classes need to exist. The import statement simply adds the package to the search path when hunting for names. So no true dependency is created by such imports, and they therefore serve to keep our modules less coupled.
There are times when the long list of specific imports can be useful. For example, if you are dealing with legacy code and you want to find out what classes you need to build mocks and stubs for, you can walk down the list of specific imports to find out the true qualified names of all those classes and then put the appropriate stubs in place. However, this use for specific imports is very rare. Furthermore, most modern IDEs will allow you to convert the wildcarded imports to a list of specific imports with a single command. So even in the legacy case it’s better to import wildcards.
Wildcard imports can sometimes cause name conflicts and ambiguities. Two classes with the same name, but in different packages, will need to be specifically imported, or at least specifically qualified when used. This can be a nuisance but is rare enough that using wildcard imports is still generally better than specific imports.
J2: Don’t Inherit Constants
I have seen this several times and it always makes me grimace. A programmer puts some constants in an interface and then gains access to those constants by inheriting that interface. Take a look at the following code:
```java
public class HourlyEmployee extends Employee {
private int tenthsWorked;
private double hourlyRate;
public Money calculatePay() {
int straightTime = Math.min(tenthsWorked, TENTHS_PER_WEEK);
int overTime = tenthsWorked - straightTime;
return new Money(
hourlyRate * (tenthsWorked + OVERTIME_RATE * overTime)
);
}
…
}
```
Where did the constants TENTHS_PER_WEEK and OVERTIME_RATE come from? They might have come from class Employee; so let’s take a look at that:
```java
public abstract class Employee implements PayrollConstants {
public abstract boolean isPayday();
public abstract Money calculatePay();
public abstract void deliverPay(Money pay);
}
```
Nope, not there. But then where? Look closely at class Employee. It implements PayrollConstants.
```java
public interface PayrollConstants {
public static final int TENTHS_PER_WEEK = 400;
public static final double OVERTIME_RATE = 1.5;
}
```
This is a hideous practice! The constants are hidden at the top of the inheritance hierarchy. Ick! Don’t use inheritance as a way to cheat the scoping rules of the language. Use a static import instead.
```java
import static PayrollConstants.*;
public class HourlyEmployee extends Employee {
private int tenthsWorked;
private double hourlyRate;
public Money calculatePay() {
int straightTime = Math.min(tenthsWorked, TENTHS_PER_WEEK);
int overTime = tenthsWorked - straightTime;
return new Money(
hourlyRate * (tenthsWorked + OVERTIME_RATE * overTime)
);
}
…
}
```
J3: Constants versus Enums
Now that enums have been added to the language (Java 5), use them! Don’t keep using the old trick of public static final ints. The meaning of ints can get lost. The meaning of enums cannot, because they belong to an enumeration that is named.
What’s more, study the syntax for enums carefully. They can have methods and fields. This makes them very powerful tools that allow much more expression and flexibility than ints. Consider this variation on the payroll code:
```java
public class HourlyEmployee extends Employee {
private int tenthsWorked;
HourlyPayGrade grade;
public Money calculatePay() {
int straightTime = Math.min(tenthsWorked, TENTHS_PER_WEEK);
int overTime = tenthsWorked - straightTime;
return new Money(
grade.rate() * (tenthsWorked + OVERTIME_RATE * overTime)
);
}
…
}
public enum HourlyPayGrade {
APPRENTICE {
public double rate() {
return 1.0;
}
},
LEUTENANT_JOURNEYMAN {
public double rate() {
return 1.2;
}
},
JOURNEYMAN {
public double rate() {
return 1.5;
}
},
MASTER {
public double rate() {
return 2.0;
}
};
public abstract double rate();
}
```
NAMES
N1: Choose Descriptive Names
Don’t be too quick to choose a name. Make sure the name is descriptive. Remember that meanings tend to drift as software evolves, so frequently reevaluate the appropriateness of the names you choose.
This is not just a “feel-good” recommendation. Names in software are 90 percent of what make software readable. You need to take the time to choose them wisely and keep them relevant. Names are too important to treat carelessly.
Consider the code below. What does it do? If I show you the code with well-chosen names, it will make perfect sense to you, but like this it’s just a hodge-podge of symbols and magic numbers.
```java
public int x() {
int q = 0;
int z = 0;
for (int kk = 0; kk < 10; kk++) {
if (l[z] == 10)
{
q += 10 + (l[z + 1] + l[z + 2]);
z += 1;
}
else if (l[z] + l[z + 1] == 10)
{
q += 10 + l[z + 2];
z += 2;
} else {
q += l[z] + l[z + 1];
z += 2;
}
}
return q;
}
```
Here is the code the way it should be written. This snippet is actually less complete than the one above. Yet you can infer immediately what it is trying to do, and you could very likely write the missing functions based on that inferred meaning. The magic numbers are no longer magic, and the structure of the algorithm is compellingly descriptive.
```java
public int score() {
int score = 0;
int frame = 0;
for (int frameNumber = 0; frameNumber < 10; frameNumber++) {
if (isStrike(frame)) {
score += 10 + nextTwoBallsForStrike(frame);
frame += 1;
} else if (isSpare(frame)) {
score += 10 + nextBallForSpare(frame);
frame += 2;
} else {
score += twoBallsInFrame(frame);
frame += 2;
}
}
return score;
}
```
The power of carefully chosen names is that they overload the structure of the code with description. That overloading sets the readers’ expectations about what the other functions in the module do. You can infer the implementation of isStrike() by looking at the code above. When you read the isStrike method, it will be “pretty much what you expected.”13
13. See Ward Cunningham’s quote on page 11.
```java
private boolean isStrike(int frame) {
return rolls[frame] == 10;
}
```
N2: Choose Names at the Appropriate Level of Abstraction
Don’t pick names that communicate implementation; choose names the reflect the level of abstraction of the class or function you are working in. This is hard to do. Again, people are just too good at mixing levels of abstractions. Each time you make a pass over your code, you will likely find some variable that is named at too low a level. You should take the opportunity to change those names when you find them. Making code readable requires a dedication to continuous improvement. Consider the Modem interface below:
```java
public interface Modem {
boolean dial(String phoneNumber);
boolean disconnect();
boolean send(char c);
char recv();
String getConnectedPhoneNumber();
}
```
At first this looks fine. The functions all seem appropriate. Indeed, for many applications they are. But now consider an application in which some modems aren’t connected by dialling. Rather they are connected permanently by hard wiring them together (think of the cable modems that provide Internet access to most homes nowadays). Perhaps some are connected by sending a port number to a switch over a USB connection. Clearly the notion of phone numbers is at the wrong level of abstraction. A better naming strategy for this scenario might be:
```java
public interface Modem {
boolean connect(String connectionLocator);
boolean disconnect();
boolean send(char c);
char recv();
String getConnectedLocator();
}
```
Now the names don’t make any commitments about phone numbers. They can still be used for phone numbers, or they could be used for any other kind of connection strategy.
N3: Use Standard Nomenclature Where Possible
Names are easier to understand if they are based on existing convention or usage. For example, if you are using the DECORATOR pattern, you should use the word Decorator in the names of the decorating classes. For example, AutoHangupModemDecorator might be the name of a class that decorates a Modem with the ability to automatically hang up at the end of a session.
Patterns are just one kind of standard. In Java, for example, functions that convert objects to string representations are often named toString. It is better to follow conventions like these than to invent your own.
Teams will often invent their own standard system of names for a particular project. Eric Evans refers to this as a ubiquitous language for the project.14 Your code should use the terms from this language extensively. In short, the more you can use names that are overloaded with special meanings that are relevant to your project, the easier it will be for readers to know what your code is talking about.
14. [DDD].
N4: Unambiguous Names
Choose names that make the workings of a function or variable unambiguous. Consider this example from FitNesse:
```java
private String doRename() throws Exception
{
if(refactorReferences)
renameReferences();
renamePage();
pathToRename.removeNameFromEnd();
pathToRename.addNameToEnd(newName);
return PathParser.render(pathToRename);
}
```
The name of this function does not say what the function does except in broad and vague terms. This is emphasized by the fact that there is a function named renamePage inside the function named doRename! What do the names tell you about the difference between the two functions? Nothing.
A better name for that function is renamePageAndOptionallyAllReferences. This may seem long, and it is, but it’s only called from one place in the module, so it’s explanatory value outweighs the length.
N5: Use Long Names for Long Scopes
The length of a name should be related to the length of the scope. You can use very short variable names for tiny scopes, but for big scopes you should use longer names.
Variable names like i and j are just fine if their scope is five lines long. Consider this snippet from the old standard “Bowling Game”:
```java
private void rollMany(int n, int pins)
{
for (int i=0; i 软件中随处可见命名。我们给变量、函数、参数、类和封包命名。我们给源代码及源代码所在目录命名。我们给 jar 文件、war 文件和 ear 文件命名。我们命名、命名,不断命名。既然有这么多命名要做,不妨做好它。下文列出了取个好名字的几条简单规则。
## 2.2 USE INTENTION-REVEALING NAMES 名副其实
It is easy to say that names should reveal intent. What we want to impress upon you is that we are serious about this. Choosing good names takes time but saves more than it takes. So take care with your names and change them when you find better ones. Everyone who reads your code (including you) will be happier if you do.
> 名副其实说起来简单。我们想要强调,这事很严肃。选个好名字要花时间,但省下来的时间比花掉的多。注意命名,而且一旦发现有更好的名称,就换掉旧的。这么做,读你代码的人(包括你自己)都会更开心。
The name of a variable, function, or class, should answer all the big questions. It should tell you why it exists, what it does, and how it is used. If a name requires a comment, then the name does not reveal its intent.
> 变量、函数或类的名称应该已经答复了所有的大问题。它该告诉你,它为什么会存在,它做什么事,应该怎么用。如果名称需要注释来补充,那就不算是名副其实。
```java
int d; // elapsed time in days
```
The name d reveals nothing. It does not evoke a sense of elapsed time, nor of days. We should choose a name that specifies what is being measured and the unit of that measurement:
> 名称 d 什么也没说明。它没有引起对时间消逝的感觉,更别说以日计了。我们应该选择指明了计量对象和计量单位的名称:
```java
int elapsedTimeInDays;
int daysSinceCreation;
int daysSinceModification;
int fileAgeInDays;
```
Choosing names that reveal intent can make it much easier to understand and change code. What is the purpose of this code?
> 选择体现本意的名称能让人更容易理解和修改代码。下列代码的目的何在?
```java
public List getThem() {
List list1 = new ArrayList();
for (int[] x : theList)
if (x[0] == 4)
list1.add(x);
return list1;
}
```
Why is it hard to tell what this code is doing? There are no complex expressions. Spacing and indentation are reasonable. There are only three variables and two constants mentioned. There aren’t even any fancy classes or polymorphic methods, just a list of arrays (or so it seems).
> 为什么难以说明上列代码要做什么事?里面并没有复杂的表达式。空格和缩进中规中矩。只用到三个变量和两个常量。甚至没有涉及任何其他类或多态方法,只是(或者看起来是)一个数组的列表而已。
The problem isn’t the simplicity of the code but the implicity of the code (to coin a phrase): the degree to which the context is not explicit in the code itself. The code implicitly requires that we know the answers to questions such as:
> 问题不在于代码的简洁度,而是在于代码的模糊度:即上下文在代码中未被明确体现的程度。上列代码要求我们了解类似以下问题的答案:
1. What kinds of things are in theList?
2. What is the significance of the zeroth subscript of an item in theList?
3. What is the significance of the value 4?
4. How would I use the list being returned?
---
> 1. theList 中是什么类型的东西?
> 2. theList 零下标条目的意义是什么?
> 3. 值 4 的意义是什么?
> 4. 我怎么使用返回的列表?
The answers to these questions are not present in the code sample, but they could have been. Say that we’re working in a mine sweeper game. We find that the board is a list of cells called theList. Let’s rename that to gameBoard.
> 问题的答案没体现在代码段中,可那就是它们该在的地方。比方说,我们在开发一种扫雷游戏,我们发现,盘面是名为 theList 的单元格列表,那就将其名称改为 gameBoard。
Each cell on the board is represented by a simple array. We further find that the zeroth subscript is the location of a status value and that a status value of 4 means “flagged.” Just by giving these concepts names we can improve the code considerably:
> 盘面上每个单元格都用一个简单数组表示。我们还发现,零下标条目是一种状态值,而该种状态值为 4 表示“已标记”。只要改为有意义的名称,代码就会得到相当程度的改进:
```java
public List getFlaggedCells() {
List flaggedCells = new ArrayList();
for (int[] cell : gameBoard)
if (cell[STATUS_VALUE] == FLAGGED)
flaggedCells.add(cell);
return flaggedCells;
}
```
Notice that the simplicity of the code has not changed. It still has exactly the same number of operators and constants, with exactly the same number of nesting levels. But the code has become much more explicit.
> 注意,代码的简洁性并未被触及。运算符和常量的数量全然保持不变,嵌套数量也全然保持不变。但代码变得明确多了。
We can go further and write a simple class for cells instead of using an array of ints. It can include an intention-revealing function (call it isFlagged) to hide the magic numbers. It results in a new version of the function:
> 还可以更进一步,不用 int 数组表示单元格,而是另写一个类。该类包括一个名副其实的函数(称为 isFlagged),从而掩盖住那个魔术数[1]。于是得到函数的新版本:
```java
public List getFlaggedCells() {
List flaggedCells = new ArrayList();
for (Cell cell : gameBoard)
if (cell.isFlagged())
flaggedCells.add(cell);
return flaggedCells;
}
```
With these simple name changes, it’s not difficult to understand what’s going on. This is the power of choosing good names.
> 只要简单改一下名称,就能轻易知道发生了什么。这就是选用好名称的力量。
## 2.3 AVOID DISINFORMATION 避免误导
Programmers must avoid leaving false clues that obscure the meaning of code. We should avoid words whose entrenched meanings vary from our intended meaning. For example, hp, aix, and sco would be poor variable names because they are the names of Unix platforms or variants. Even if you are coding a hypotenuse and hp looks like a good abbreviation, it could be disinformative.
> 程序员必须避免留下掩藏代码本意的错误线索。应当避免使用与本意相悖的词。例如,hp、aix 和 sco 都不该用做变量名,因为它们都是 UNIX 平台或类 UNIX 平台的专有名称。即便你是在编写三角计算程序, hp 看起来是个不错的缩写[2],但那也可能会提供错误信息。
Do not refer to a grouping of accounts as an accountList unless it’s actually a List. The word list means something specific to programmers. If the container holding the accounts is not actually a List, it may lead to false conclusions.1 So accountGroup or bunchOfAccounts or just plain accounts would be better.
> 别用 accountList 来指称一组账号,除非它真的是 List 类型。List 一词对程序员有特殊意义。如果包纳账号的容器并非真是个 List,就会引起错误的判断[3]。所以,用 accountGroup 或 bunchOfAccounts,甚至直接用 accounts 都会好一些。
1. As we’ll see later on, even if the container is a List, it’s probably better not to encode the container type into the name.
Beware of using names which vary in small ways. How long does it take to spot the subtle difference between a XYZControllerForEfficientHandlingOfStrings in one module and, somewhere a little more distant, XYZControllerForEfficientStorageOfStrings? The words have frightfully similar shapes.
> 提防使用不同之处较小的名称。想区分模块中某处的 XYZControllerFor EfficientHandlingOfStrings 和另一处的 XYZControllerForEfficientStorageOfStrings,会花多长时间呢?这两个词外形实在太相似了。
Spelling similar concepts similarly is information. Using inconsistent spellings is disinformation. With modern Java environments we enjoy automatic code completion. We write a few characters of a name and press some hotkey combination (if that) and are rewarded with a list of possible completions for that name. It is very helpful if names for very similar things sort together alphabetically and if the differences are very obvious, because the developer is likely to pick an object by name without seeing your copious comments or even the list of methods supplied by that class.
> 以同样的方式拼写出同样的概念才是信息。拼写前后不一致就是误导。我们很享受现代 Java 编程环境的自动代码完成特性。键入某个名称的前几个字母,按一下某个热键组合(如果有的话),就能得到一列该名称的可能形式。假如相似的名称依字母顺序放在一起,且差异很明显,那就会相当有助益,因为程序员多半会压根不看你的详细注释,甚至不看该类的方法列表就直接看名字挑一个对象。
A truly awful example of disinformative names would be the use of lower-case L or uppercase O as variable names, especially in combination. The problem, of course, is that they look almost entirely like the constants one and zero, respectively.
> 误导性名称真正可怕的例子,是用小写字母 l 和大写字母 O 作为变量名,尤其是在组合使用的时候。当然,问题在于它们看起来完全像是常量“壹”和“零”。
```java
int a = l;
if ( O == l )
a = O1;
else
l = 01;
```
The reader may think this a contrivance, but we have examined code where such things were abundant. In one case the author of the code suggested using a different font so that the differences were more obvious, a solution that would have to be passed down to all future developers as oral tradition or in a written document. The problem is conquered with finality and without creating new work products by a simple renaming.
> 读者可能会认为这纯属虚构,但我们确曾见过充斥这类玩意的代码。有一次,代码作者建议用不同字体写变量名,好显得更清楚些,不过这种方案得要通过口头和书面传递给未来所有的开发者才行。后来,只是做了简单的重命名操作,就解决了问题,而且也没搞出别的事。
## 2.4 MAKE MEANINGFUL DISTINCTIONS 做有意义的区分

Programmers create problems for themselves when they write code solely to satisfy a compiler or interpreter. For example, because you can’t use the same name to refer to two different things in the same scope, you might be tempted to change one name in an arbitrary way. Sometimes this is done by misspelling one, leading to the surprising situation where correcting spelling errors leads to an inability to compile.2
> 如果程序员只是为满足编译器或解释器的需要而写代码,就会制造麻烦。例如,因为同一作用范围内两样不同的东西不能重名,你可能会随手改掉其中一个的名称。有时干脆以错误的拼写充数,结果就是出现在更正拼写错误后导致编译器出错的情况。
2. Consider, for example, the truly hideous practice of creating a variable named klass just because the name class was used for something else.
It is not sufficient to add number series or noise words, even though the compiler is satisfied. If names must be different, then they should also mean something different.
> 光是添加数字系列或是废话远远不够,即便这足以让编译器满意。如果名称必须相异,那其意思也应该不同才对。
Number-series naming (a1, a2, .. aN) is the opposite of intentional naming. Such names are not disinformative—they are noninformative; they provide no clue to the author’s intention. Consider:
> 以数字系列命名(a1、a2,……aN)是依义命名的对立面。这样的名称纯属误导——完全没有提供正确信息;没有提供导向作者意图的线索。试看:
```java
public static void copyChars(char a1[], char a2[]) {
for (int i = 0; i < a1.length; i++) {
a2[i] = a1[i];
}
}
```
This function reads much better when source and destination are used for the argument names.
> 如果参数名改为 source 和 destination,这个函数就会像样许多。
Noise words are another meaningless distinction. Imagine that you have a Product class. If you have another called ProductInfo or ProductData, you have made the names different without making them mean anything different. Info and Data are indistinct noise words like a, an, and the.
> 废话是另一种没意义的区分。假设你有一个 Product 类。如果还有一个 ProductInfo 或 ProductData 类,那它们的名称虽然不同,意思却无区别。Info 和 Data 就像 a、an 和 the 一样,是意义含混的废话。
Note that there is nothing wrong with using prefix conventions like a and the so long as they make a meaningful distinction. For example you might use a for all local variables and the for all function arguments.3 The problem comes in when you decide to call a variable theZork because you already have another variable named zork.
> 注意,只要体现出有意义的区分,使用 a 和 the 这样的前缀就没错。例如,你可能把 a 用在域内变量,而把 the 用于函数参数[5]。但如果你已经有一个名为 zork 的变量,又想调用一个名为 theZork 的变量,麻烦就来了。
3. Uncle Bob used to do this in C++ but has given up the practice because modern IDEs make it unnecessary.
Noise words are redundant. The word variable should never appear in a variable name. The word table should never appear in a table name. How is NameString better than Name? Would a Name ever be a floating point number? If so, it breaks an earlier rule about disinformation. Imagine finding one class named Customer and another named CustomerObject. What should you understand as the distinction? Which one will represent the best path to a customer’s payment history?
> 废话都是冗余。Variable 一词永远不应当出现在变量名中。Table 一词永远不应当出现在表名中。NameString 会比 Name 好吗?难道 Name 会是一个浮点数不成?如果是这样,就触犯了关于误导的规则。设想有个名为 Customer 的类,还有一个名为 CustomerObject 的类。区别何在呢?哪一个是表示客户历史支付情况的最佳途径?
There is an application we know of where this is illustrated. we’ve changed the names to protect the guilty, but here’s the exact form of the error:
> 有个应用反映了这种状况。为当事者讳,我们改了一下,不过犯错的代码的确就是这个样子:
```java
getActiveAccount();
getActiveAccounts();
getActiveAccountInfo();
```
How are the programmers in this project supposed to know which of these functions to call?
> 程序员怎么能知道该调用哪个函数呢?
In the absence of specific conventions, the variable moneyAmount is indistinguishable from money, customerInfo is indistinguishable from customer, accountData is indistinguishable from account, and theMessage is indistinguishable from message. Distinguish names in such a way that the reader knows what the differences offer.
> 如果缺少明确约定,变量 moneyAmount 就与 money 没区别, customerInfo 与 customer 没区别,accountData 与 account 没区别, theMessage 也与 message 没区别。要区分名称,就要以读者能鉴别不同之处的方式来区分。
## 2.5 USE PRONOUNCEABLE NAMES 使用读得出来的名称
Humans are good at words. A significant part of our brains is dedicated to the concept of words. And words are, by definition, pronounceable. It would be a shame not to take advantage of that huge portion of our brains that has evolved to deal with spoken language. So make your names pronounceable.
> 人类长于记忆和使用单词。大脑的相当一部分就是用来容纳和处理单词的。单词能读得出来。人类进化到大脑中有那么大的一块地方用来处理言语,若不善加利用,实在是种耻辱。
If you can’t pronounce it, you can’t discuss it without sounding like an idiot. “Well, over here on the bee cee arr three cee enn tee we have a pee ess zee kyew int, see?” This matters because programming is a social activity.
> 如果名称读不出来,讨论的时候就会像个傻鸟。“哎,这儿,鼻涕阿三喜摁踢(bee cee arr three cee enn tee)[6]上头,有个皮挨死极翘(pee ess zee kyew)[7]整数,看见没?”这不是小事,因为编程本就是一种社会活动。
A company I know has genymdhms (generation date, year, month, day, hour, minute, and second) so they walked around saying “gen why emm dee aich emm ess”. I have an annoying habit of pronouncing everything as written, so I started saying “gen-yah-muddahims.” It later was being called this by a host of designers and analysts, and we still sounded silly. But we were in on the joke, so it was fun. Fun or not, we were tolerating poor naming. New developers had to have the variables explained to them, and then they spoke about it in silly made-up words instead of using proper English terms. Compare
> 有家公司,程序里面写了个 genymdhms(生成日期,年、月、日、时、分、秒),他们一般读作“gen why emm dee aich emm ess”[8]。我有个见字照读的恶习,于是开口就念“gen-yah-mudda-hims”。后来好些设计师和分析师都有样学样,听起来傻乎乎的。我们知道典故,所以会觉得很搞笑。搞笑归搞笑,实际是在强忍糟糕的命名。在给新开发者解释变量的意义时,他们总是读出傻乎乎的自造词,而非恰当的英语词。比较
```java
class DtaRcrd102 {
private Date genymdhms;
private Date modymdhms;
private final String pszqint = ”102”;
/* … */
};
```
to
> 和
```java
class Customer {
private Date generationTimestamp;
private Date modificationTimestamp;;
private final String recordId = ”102”;
/* … */
};
```
Intelligent conversation is now possible: “Hey, Mikey, take a look at this record! The generation timestamp is set to tomorrow’s date! How can that be?”
现在读起来就像人话了:“喂,Mikey,看看这条记录!生成时间戳(generation timestamp) [9]被设置为明天了!不能这样吧?”
## 2.6 USE SEARCHABLE NAMES 使用可搜索的名称
Single-letter names and numeric constants have a particular problem in that they are not easy to locate across a body of text.
> 单字母名称和数字常量有个问题,就是很难在一大篇文字中找出来。
One might easily grep for MAX_CLASSES_PER_STUDENT, but the number 7 could be more troublesome. Searches may turn up the digit as part of file names, other constant definitions, and in various expressions where the value is used with different intent. It is even worse when a constant is a long number and someone might have transposed digits, thereby creating a bug while simultaneously evading the programmer’s search.
> 找 MAX_CLASSES_PER_STUDENT 很容易,但想找数字 7 就麻烦了,它可能是某些文件名或其他常量定义的一部分,出现在因不同意图而采用的各种表达式中。如果该常量是个长数字,又被人错改过,就会逃过搜索,从而造成错误。
Likewise, the name e is a poor choice for any variable for which a programmer might need to search. It is the most common letter in the English language and likely to show up in every passage of text in every program. In this regard, longer names trump shorter names, and any searchable name trumps a constant in code.
> 同样,e 也不是个便于搜索的好变量名。它是英文中最常用的字母,在每个程序、每段代码中都有可能出现。由此而见,长名称胜于短名称,搜得到的名称胜于用自造编码代写就的名称。
My personal preference is that single-letter names can ONLY be used as local variables inside short methods. The length of a name should correspond to the size of its scope [N5]. If a variable or constant might be seen or used in multiple places in a body of code, it is imperative to give it a search-friendly name. Once again compare
> 窃以为单字母名称仅用于短方法中的本地变量。名称长短应与其作用域大小相对应[N5]。若变量或常量可能在代码中多处使用,则应赋其以便于搜索的名称。再比较
```java
for (int j=0; j<34; j++) {
s += (t[j]*4)/5;
}
```
to
> 和
```java
int realDaysPerIdealDay = 4;
const int WORK_DAYS_PER_WEEK = 5;
int sum = 0;
for (int j=0; j < NUMBER_OF_TASKS; j++) {
int realTaskDays = taskEstimate[j] * realDaysPerIdealDay;
int realTaskWeeks = (realdays / WORK_DAYS_PER_WEEK);
sum += realTaskWeeks;
}
```
Note that sum, above, is not a particularly useful name but at least is searchable. The intentionally named code makes for a longer function, but consider how much easier it will be to find WORK_DAYS_PER_WEEK than to find all the places where 5 was used and filter the list down to just the instances with the intended meaning.
> 注意,上面代码中的 sum 并非特别有用的名称,不过它至少搜得到。采用能表达意图的名称,貌似拉长了函数代码,但要想想看,WORK_DAYS_PER_WEEK 要比数字 5 好找得多,而列表中也只剩下了体现作者意图的名称。
## 2.7 AVOID ENCODINGS 避免使用编码
We have enough encodings to deal with without adding more to our burden. Encoding type or scope information into names simply adds an extra burden of deciphering. It hardly seems reasonable to require each new employee to learn yet another encoding “language” in addition to learning the (usually considerable) body of code that they’ll be working in. It is an unnecessary mental burden when trying to solve a problem. Encoded names are seldom pronounceable and are easy to mis-type.
> 编码已经太多,无谓再自找麻烦。把类型或作用域编进名称里面,徒然增加了解码的负担。没理由要求每位新人都在弄清要应付的代码之外(那算是正常的),还要再搞懂另一种编码“语言”。这对于解决问题而言,纯属多余的负担。带编码的名称通常也不便发音,容易打错。
### 2.7.1 Hungarian Notation 匈牙利语标记法
In days of old, when we worked in name-length-challenged languages, we violated this rule out of necessity, and with regret. Fortran forced encodings by making the first letter a code for the type. Early versions of BASIC allowed only a letter plus one digit. Hungarian Notation (HN) took this to a whole new level.
> 在往昔名称长短很要命的时代,我们毫无必要地破坏了不编码的规矩,如今后悔不迭。Fortran 语言要求首字母体现出类型,导致了编码的产生。BASIC 早期版本只允许使用一个字母再加上一位数字。匈牙利语标记法(Hungarian Notation,HN)将这种态势愈演愈烈。
HN was considered to be pretty important back in the Windows C API, when everything was an integer handle or a long pointer or a void pointer, or one of several implementations of “string” (with different uses and attributes). The compiler did not check types in those days, so the programmers needed a crutch to help them remember the types.
> 在 Windows 的 C 语言 API 的时代,HN 相当重要,那时所有名称要么是个整数句柄,要么是个长指针或者 void 指针,要不然就是 string 的几种实现(有不同的用途和属性)之一。那时候编译器并不做类型检查,程序员需要匈牙利语标记法来帮助自己记住类型。
In modern languages we have much richer type systems, and the compilers remember and enforce the types. What’s more, there is a trend toward smaller classes and shorter functions so that people can usually see the point of declaration of each variable they’re using.
> 现代编程语言具有更丰富的类型系统,编译器也记得并强制使用类型。而且,人们趋向于使用更小的类、更短的方法,好让每个变量的定义都在视野范围之内。
Java programmers don’t need type encoding. Objects are strongly typed, and editing environments have advanced such that they detect a type error long before you can run a compile! So nowadays HN and other forms of type encoding are simply impediments. They make it harder to change the name or type of a variable, function, or class. They make it harder to read the code. And they create the possibility that the encoding system will mislead the reader.
> Java 程序员不需要类型编码。对象是强类型的,代码编辑环境已经先进到在编译开始前就侦测到类型错误的程度!所以,如今 HN 和其他类型编码形式都纯属多余。它们增加了修改变量、函数或类的名称或类型的难度。它们增加了阅读代码的难度。它们制造了让编码系统误导读者的可能性。
```java
PhoneNumber phoneString;
// name not changed when type changed!
```
### 2.7.2 Member Prefixes 成员前缀
You also don’t need to prefix member variables with m\_ anymore. Your classes and functions should be small enough that you don’t need them. And you should be using an editing environment that highlights or colorizes members to make them distinct.
> 也不必用 m\_前缀来标明成员变量。应当把类和函数做得足够小,消除对成员前缀的需要。你应当使用某种可以高亮或用颜色标出成员的编辑环境。
```java
public class Part {
private String m_dsc; // The textual description
void setName(String name) {
m_dsc = name;
}
}
_________________________________________________
public class Part {
String description;
void setDescription(String description) {
this.description = description;
}
}
```
Besides, people quickly learn to ignore the prefix (or suffix) to see the meaningful part of the name. The more we read the code, the less we see the prefixes. Eventually the prefixes become unseen clutter and a marker of older code.
> 此外,人们会很快学会无视前缀(或后缀),只看到名称中有意义的部分。代码读得越多,眼中就越没有前缀。最终,前缀变作了不入法眼的废料,变作了旧代码的标志物。
### 2.7.3 Interfaces and Implementations 接口和实现
These are sometimes a special case for encodings. For example, say you are building an ABSTRACT FACTORY for the creation of shapes. This factory will be an interface and will be implemented by a concrete class. What should you name them? IShapeFactory and ShapeFactory? I prefer to leave interfaces unadorned. The preceding I, so common in today’s legacy wads, is a distraction at best and too much information at worst. I don’t want my users knowing that I’m handing them an interface. I just want them to know that it’s a ShapeFactory. So if I must encode either the interface or the implementation, I choose the implementation. Calling it ShapeFactoryImp, or even the hideous CShapeFactory, is preferable to encoding the interface.
> 有时也会出现采用编码的特殊情形。比如,你在做一个创建形状用的抽象工厂(Abstract Factory)。该工厂是个接口,要用具体类来实现。你怎么来命名工厂和具体类呢?IShapeFactory 和 ShapeFactory 吗?我喜欢不加修饰的接口。前导字母 I 被滥用到了说好听点是干扰,说难听点根本就是废话的程度。我不想让用户知道我给他们的是接口。我就想让他们知道那是个 ShapeFactory。如果接口和实现必须选一个来编码的话,我宁肯选择实现。ShapeFactoryImp,甚至是丑陋的 CShapeFactory,都比对接口名称编码来得好。
## 2.8 AVOID MENTAL MAPPING 避免思维映射
Readers shouldn’t have to mentally translate your names into other names they already know. This problem generally arises from a choice to use neither problem domain terms nor solution domain terms.
> 不应当让读者在脑中把你的名称翻译为他们熟知的名称。这种问题经常出现在选择是使用问题领域术语还是解决方案领域术语时。
This is a problem with single-letter variable names. Certainly a loop counter may be named i or j or k (though never l!) if its scope is very small and no other names can conflict with it. This is because those single-letter names for loop counters are traditional. However, in most other contexts a single-letter name is a poor choice; it’s just a place holder that the reader must mentally map to the actual concept. There can be no worse reason for using the name c than because a and b were already taken.
> 单字母变量名就是个问题。在作用域较小、也没有名称冲突时,循环计数器自然有可能被命名为 i 或 j 或 k。(但千万别用字母 l!)这是因为传统上惯用单字母名称做循环计数器。然而,在多数其他情况下,单字母名称不是个好选择;读者必须在脑中将它映射为真实概念。仅仅是因为有了 a 和 b,就要取名为 c,实在并非像样的理由。
In general programmers are pretty smart people. Smart people sometimes like to show off their smarts by demonstrating their mental juggling abilities. After all, if you can reliably remember that r is the lower-cased version of the url with the host and scheme removed, then you must clearly be very smart.
> 程序员通常都是聪明人。聪明人有时会借脑筋急转弯炫耀其聪明。总而言之,假使你记得 r 代表不包含主机名和图式(scheme)的小写字母版 url 的话,那你真是太聪明了。
One difference between a smart programmer and a professional programmer is that the professional understands that clarity is king. Professionals use their powers for good and write code that others can understand.
> 聪明程序员和专业程序员之间的区别在于,专业程序员了解,明确是王道。专业程序员善用其能,编写其他人能理解的代码。
## 2.9 CLASS NAMES 类名
Classes and objects should have noun or noun phrase names like Customer, WikiPage, Account, and AddressParser. Avoid words like Manager, Processor, Data, or Info in the name of a class. A class name should not be a verb.
> 类名和对象名应该是名词或名词短语,如 Customer、WikiPage、Account 和 AddressParser。避免使用 Manager、Processor、Data 或 Info 这样的类名。类名不应当是动词。
## 2.10 METHOD NAMES 方法名
Methods should have verb or verb phrase names like postPayment, deletePage, or save. Accessors, mutators, and predicates should be named for their value and prefixed with get, set, and is according to the javabean standard.4
> 方法名应当是动词或动词短语,如 postPayment、deletePage 或 save。属性访问器、修改器和断言应该根据其值命名,并依 Javabean 标准[10]加上 get、set 和 is 前缀。
4. http://java.sun.com/products/javabeans/docs/spec.html
```java
string name = employee.getName();
customer.setName(”mike”);
if (paycheck.isPosted())…
```
When constructors are overloaded, use static factory methods with names that describe the arguments. For example,
> 重载构造器时,使用描述了参数的静态工厂方法名。例如,
```java
Complex fulcrumPoint = Complex.FromRealNumber(23.0);
```
is generally better than
> 通常好于
```java
Complex fulcrumPoint = new Complex(23.0);
```
Consider enforcing their use by making the corresponding constructors private.
> 可以考虑将相应的构造器设置为 private,强制使用这种命名手段。
## 2.11 ON’T BE CUTE 别扮可爱
If names are too clever, they will be memorable only to people who share the author’s sense of humor, and only as long as these people remember the joke. Will they know what the function named HolyHandGrenade is supposed to do? Sure, it’s cute, but maybe in this case DeleteItems might be a better name. Choose clarity over entertainment value.
> 如果名称太耍宝,那就只有同作者一般有幽默感的人才能记得住,而且还是在他们记得那个笑话的时候才行。谁会知道名为 HolyHandGrenade[11]的函数是用来做什么的呢?没错,这名字挺伶俐,不过 DeleteItems[12]或许是更好的名称。宁可明确,毋为好玩。

Cuteness in code often appears in the form of colloquialisms or slang. For example, don’t use the name whack() to mean kill(). Don’t tell little culture-dependent jokes like eatMyShorts() to mean abort().
> 扮可爱的做法在代码中经常体现为使用俗话或俚语。例如,别用 whack( )[13]来表示 kill( )。别用 eatMyShorts( )[14]这类与文化紧密相关的笑话来表示 abort( )。
Say what you mean. Mean what you say.
> 言到意到。意到言到。
## 2.12 PICK ONE WORD PER CONCEPT 每个概念对应一个词
Pick one word for one abstract concept and stick with it. For instance, it’s confusing to have fetch, retrieve, and get as equivalent methods of different classes. How do you remember which method name goes with which class? Sadly, you often have to remember which company, group, or individual wrote the library or class in order to remember which term was used. Otherwise, you spend an awful lot of time browsing through headers and previous code samples.
> 给每个抽象概念选一个词,并且一以贯之。例如,使用 fetch、 retrieve 和 get 来给在多个类中的同种方法命名。你怎么记得住哪个类中是哪个方法呢?很悲哀,你总得记住编写库或类的公司、机构或个人,才能想得起来用的是哪个术语。否则,就得耗费大把时间浏览各个文件头及前面的代码。
Modern editing environments like Eclipse and IntelliJ provide context-sensitive clues, such as the list of methods you can call on a given object. But note that the list doesn’t usually give you the comments you wrote around your function names and parameter lists. You are lucky if it gives the parameter names from function declarations. The function names have to stand alone, and they have to be consistent in order for you to pick the correct method without any additional exploration.
> Eclipse 和 IntelliJ 之类现代编程环境提供了与环境相关的线索,比如某个对象能调用的方法列表。不过要注意,列表中通常不会给出你为函数名和参数列表编写的注释。如果参数名称来自函数声明,你就太幸运了。函数名称应当独一无二,而且要保持一致,这样你才能不借助多余的浏览就找到正确的方法。
Likewise, it’s confusing to have a controller and a manager and a driver in the same code base. What is the essential difference between a DeviceManager and a Protocol-Controller? Why are both not controllers or both not managers? Are they both Drivers really? The name leads you to expect two objects that have very different type as well as having different classes.
> 同样,在同一堆代码中有 controller,又有 manager,还有 driver,就会令人困惑。DeviceManager 和 Protocol-Controller 之间有何根本区别?为什么不全用 controllers 或 managers?他们都是 Drivers 吗?这种名称,让人觉得这两个对象是不同类型的,也分属不同的类。
A consistent lexicon is a great boon to the programmers who must use your code.
> 对于那些会用到你代码的程序员,一以贯之的命名法简直就是天降福音。
## 2.13 DON’T PUN 别用双关语
Avoid using the same word for two purposes. Using the same term for two different ideas is essentially a pun.
> 避免将同一单词用于不同目的。同一术语用于不同概念,基本上就是双关语了。
If you follow the “one word per concept” rule, you could end up with many classes that have, for example, an add method. As long as the parameter lists and return values of the various add methods are semantically equivalent, all is well.
> 如果遵循“一词一义”规则,可能在好多个类里面都会有 add 方法。只要这些 add 方法的参数列表和返回值在语义上等价,就一切顺利。
However one might decide to use the word add for “consistency” when he or she is not in fact adding in the same sense. Let’s say we have many classes where add will create a new value by adding or concatenating two existing values. Now let’s say we are writing a new class that has a method that puts its single parameter into a collection. Should we call this method add? It might seem consistent because we have so many other add methods, but in this case, the semantics are different, so we should use a name like insert or append instead. To call the new method add would be a pun.
> 但是,可能会有人决定为“保持一致”而使用 add 这个词来命名,即便并非真的想表示这种意思。比如,在多个类中都有 add 方法,该方法通过增加或连接两个现存值来获得新值。假设要写个新类,该类中有一个方法,把单个参数放到群集(collection)中。该把这个方法叫做 add 吗?这样做貌似和其他 add 方法保持了一致,但实际上语义却不同,应该用 insert 或 append 之类词来命名才对。把该方法命名为 add,就是双关语了。
Our goal, as authors, is to make our code as easy as possible to understand. We want our code to be a quick skim, not an intense study. We want to use the popular paperback model whereby the author is responsible for making himself clear and not the academic model where it is the scholar’s job to dig the meaning out of the paper.
> 代码作者应尽力写出易于理解的代码。我们想把代码写得让别人能一目尽览,而不必殚精竭虑地研究。我们想要那种大众化的作者尽责写清楚的平装书模式;我们不想要那种学者挖地三尺才能明白个中意义的学院派模式。
## 2.14 USE SOLUTION DOMAIN NAMES 使用解决方案领域名称
Remember that the people who read your code will be programmers. So go ahead and use computer science (CS) terms, algorithm names, pattern names, math terms, and so forth. It is not wise to draw every name from the problem domain because we don’t want our coworkers to have to run back and forth to the customer asking what every name means when they already know the concept by a different name.
> 记住,只有程序员才会读你的代码。所以,尽管用那些计算机科学(Computer Science,CS)术语、算法名、模式名、数学术语吧。依据问题所涉领域来命名可不算是聪明的做法,因为不该让协作者老是跑去问客户每个名称的含义,其实他们早该通过另一名称了解这个概念了。对于熟悉访问者(VISITOR)模式的程序来说,名称 AccountVisitor 富有意义。哪个程序员会不知道 JobQueue 的意思呢?程序员要做太多技术性工作。给这些事取个技术性的名称,通常是最靠谱的做法。
The name AccountVisitor means a great deal to a programmer who is familiar with the VISITOR pattern. What programmer would not know what a JobQueue was? There are lots of very technical things that programmers have to do. Choosing technical names for those things is usually the most appropriate course.
## 2.15 USE PROBLEM DOMAIN NAMES 使用源自所涉问题领域的名称
When there is no “programmer-eese” for what you’re doing, use the name from the problem domain. At least the programmer who maintains your code can ask a domain expert what it means.
> 如果不能用程序员熟悉的术语来给手头的工作命名,就采用从所涉问题领域而来的名称吧。至少,负责维护代码的程序员就能去请教领域专家了。
Separating solution and problem domain concepts is part of the job of a good programmer and designer. The code that has more to do with problem domain concepts should have names drawn from the problem domain.
> 优秀的程序员和设计师,其工作之一就是分离解决方案领域和问题领域的概念。与所涉问题领域更为贴近的代码,应当采用源自问题领域的名称。
## 2.16 ADD MEANINGFUL CONTEXT 添加有意义的语境
There are a few names which are meaningful in and of themselves—most are not. Instead, you need to place names in context for your reader by enclosing them in well-named classes, functions, or namespaces. When all else fails, then prefixing the name may be necessary as a last resort.
> 很少有名称是能自我说明的——多数都不能。反之,你需要用有良好命名的类、函数或名称空间来放置名称,给读者提供语境。如果没这么做,给名称添加前缀就是最后一招了。
Imagine that you have variables named firstName, lastName, street, houseNumber, city, state, and zipcode. Taken together it’s pretty clear that they form an address. But what if you just saw the state variable being used alone in a method? Would you automatically infer that it was part of an address?
> 设想你有名为 firstName、lastName、street、houseNumber、city、state 和 zipcode 的变量。当它们搁一块儿的时候,很明确是构成了一个地址。不过,假使只是在某个方法中看见孤零零一个 state 变量呢?你会理所当然推断那是某个地址的一部分吗?
You can add context by using prefixes: addrFirstName, addrLastName, addrState, and so on. At least readers will understand that these variables are part of a larger structure. Of course, a better solution is to create a class named Address. Then, even the compiler knows that the variables belong to a bigger concept.
> 可以添加前缀 addrFirstName、addrLastName、addrState 等,以此提供语境。至少,读者会明白这些变量是某个更大结构的一部分。当然,更好的方案是创建名为 Address 的类。这样,即便是编译器也会知道这些变量隶属某个更大的概念了。
Consider the method in Listing 2-1. Do the variables need a more meaningful context? The function name provides only part of the context; the algorithm provides the rest. Once you read through the function, you see that the three variables, number, verb, and pluralModifier, are part of the “guess statistics” message. Unfortunately, the context must be inferred. When you first look at the method, the meanings of the variables are opaque.
> 看看代码清单 2-1 中的方法。以下变量是否需要更有意义的语境呢?函数名仅给出了部分语境;算法提供了剩下的部分。遍览函数后,你会知道 number、verb 和 pluralModifier 这三个变量是“测估”信息的一部分。不幸的是这语境得靠读者推断出来。第一眼看到这个方法时,这些变量的含义完全不清楚。
Listing 2-1 Variables with unclear context.
> 代码清单 2-1 语境不明确的变量
```java
private void printGuessStatistics(char candidate, int count) {
String number;
String verb;
String pluralModifier;
if (count == 0) {
number = ”no”;
verb = ”are”;
pluralModifier = ”s”;
} else if (count == 1) {
number = ”1”;
verb = ”is”;
pluralModifier = ””;
} else {
number = Integer.toString(count);
verb = ”are”;
pluralModifier = ”s”;
}
String guessMessage = String.format(
”There %s %s %s%s”, verb, number, candidate, pluralModifier
);
print(guessMessage);
}
```
The function is a bit too long and the variables are used throughout. To split the function into smaller pieces we need to create a GuessStatisticsMessage class and make the three variables fields of this class. This provides a clear context for the three variables. They are definitively part of the GuessStatisticsMessage. The improvement of context also allows the algorithm to be made much cleaner by breaking it into many smaller functions. (See Listing 2-2.)
> 上列函数有点儿过长,变量的使用贯穿始终。要分解这个函数,需要创建一个名为 GuessStatisticsMessage 的类,把三个变量做成该类的成员字段。这样它们就在定义上变作了 GuessStatisticsMessage 的一部分。语境的增强也让算法能够通过分解为更小的函数而变得更为干净利落。
Listing 2-2 Variables have a context.
> (如代码清单 2-2 所示。)代码清单
```java
public class GuessStatisticsMessage {
private String number;
private String verb;
private String pluralModifier;
public String make(char candidate, int count) {
createPluralDependentMessageParts(count);
return String.format(
"There %s %s %s%s",
verb, number, candidate, pluralModifier );
}
private void createPluralDependentMessageParts(int count) {
if (count == 0) {
thereAreNoLetters();
} else if (count == 1) {
thereIsOneLetter();
} else {
thereAreManyLetters(count);
}
}
private void thereAreManyLetters(int count) {
number = Integer.toString(count);
verb = "are";
pluralModifier = "s";
}
private void thereIsOneLetter() {
number = "1";
verb = "is";
pluralModifier = "";
}
private void thereAreNoLetters() {
number = "no";
verb = "are";
pluralModifier = "s";
}
}
```
## 2.17 DON’T ADD GRATUITOUS CONTEXT 不要添加没用的语境
In an imaginary application called “Gas Station Deluxe,” it is a bad idea to prefix every class with GSD. Frankly, you are working against your tools. You type G and press the completion key and are rewarded with a mile-long list of every class in the system. Is that wise? Why make it hard for the IDE to help you?
> 设若有一个名为“加油站豪华版”(Gas Station Deluxe)的应用,在其中给每个类添加 GSD 前缀就不是什么好点子。说白了,你是在和自己在用的工具过不去。输入 G,按下自动完成键,结果会得到系统中全部类的列表,列表恨不得有一英里那么长。这样做聪明吗?为什么要搞得 IDE 没法帮助你?
Likewise, say you invented a MailingAddress class in GSD’s accounting module, and you named it GSDAccountAddress. Later, you need a mailing address for your customer contact application. Do you use GSDAccountAddress? Does it sound like the right name? Ten of 17 characters are redundant or irrelevant.
> 再比如,你在 GSD 应用程序中的记账模块创建了一个表示邮件地址的类,然后给该类命名为 GSDAccountAddress。稍后,你的客户联络应用中需要用到邮件地址,你会用 GSDAccountAddress 吗?这名字听起来没问题吗?在这 17 个字母里面,有 10 个字母纯属多余和与当前语境毫无关联。
Shorter names are generally better than longer ones, so long as they are clear. Add no more context to a name than is necessary.
> 只要短名称足够清楚,就要比长名称好。别给名称添加不必要的语境。
The names accountAddress and customerAddress are fine names for instances of the class Address but could be poor names for classes. Address is a fine name for a class. If I need to differentiate between MAC addresses, port addresses, and Web addresses, I might consider PostalAddress, MAC, and URI. The resulting names are more precise, which is the point of all naming.
> 对于 Address 类的实体来说,accountAddress 和 customerAddress 都是不错的名称,不过用在类名上就不太好了。Address 是个好类名。如果需要与 MAC 地址、端口地址和 Web 地址相区别,我会考虑使用 PostalAddress、MAC 和 URI。这样的名称更为精确,而精确正是命名的要点。
## 2.18 FINAL WORDS 最后的话
The hardest thing about choosing good names is that it requires good descriptive skills and a shared cultural background. This is a teaching issue rather than a technical, business, or management issue. As a result many people in this field don’t learn to do it very well.
> 取好名字最难的地方在于需要良好的描述技巧和共有文化背景。与其说这是一种技术、商业或管理问题,还不如说是一种教学问题。其结果是,这个领域内的许多人都没能学会做得很好。
People are also afraid of renaming things for fear that some other developers will object. We do not share that fear and find that we are actually grateful when names change (for the better). Most of the time we don’t really memorize the names of classes and methods. We use the modern tools to deal with details like that so we can focus on whether the code reads like paragraphs and sentences, or at least like tables and data structure (a sentence isn’t always the best way to display data). You will probably end up surprising someone when you rename, just like you might with any other code improvement. Don’t let it stop you in your tracks.
> 我们有时会怕其他开发者反对重命名。如果讨论一下就知道,如果名称改得更好,那大家真的会感激你。多数时候我们并不记忆类名和方法名。我们使用现代工具对付这些细节,好让自己集中精力于把代码写得就像词句篇章、至少像是表和数据结构(词句并非总是呈现数据的最佳手段)。改名可能会让某人吃惊,就像你做到其他代码改善工作一样。别让这种事阻碍你的前进步伐。
Follow some of these rules and see whether you don’t improve the readability of your code. If you are maintaining someone else’s code, use refactoring tools to help resolve these problems. It will pay off in the short term and continue to pay in the long run.
> 不妨试试上面这些规则,看你的代码可读性是否有所提升。如果你是在维护别人写的代码,使用重构工具来解决问题。效果立竿见影,而且会持续下去。
================================================
FILE: docs/ch3.md
================================================
# 第 3 章 Functions 函数

In the early days of programming we composed our systems of routines and subroutines. Then, in the era of Fortran and PL/1 we composed our systems of programs, subprograms, and functions. Nowadays only the function survives from those early days. Functions are the first line of organization in any program. Writing them well is the topic of this chapter.
> 在编程的早年岁月,系统由程序和子程序组成。后来,在 Fortran 和 PL/1 的年代,系统由程序、子程序和函数组成。如今,只有函数存活下来。函数是所有程序中的第一组代码。本章将讨论如何写好函数。
Consider the code in Listing 3-1. It’s hard to find a long function in FitNesse,1 but after a bit of searching I came across this one. Not only is it long, but it’s got duplicated code, lots of odd strings, and many strange and inobvious data types and APIs. See how much sense you can make of it in the next three minutes.
> 请看代码清单 3-1。在 FitNesse[1]中,很难找到长函数,不过我还是搜寻到一个。它不光长,而且代码也很复杂,有大量字符串、怪异而不显见的数据类型和 API。花 3 分钟时间,看能读懂多少?
1. An open-source testing tool. www.fitnesse.org
Listing 3-1 HtmlUtil.java (FitNesse 20070619)
> 代码清单 3-1 HtmlUtil.java(FitNesse 20070619)
```java
public static String testableHtml(
PageData pageData,
boolean includeSuiteSetup
) throws Exception {
WikiPage wikiPage = pageData.getWikiPage();
StringBuffer buffer = new StringBuffer();
if (pageData.hasAttribute("Test")) {
if (includeSuiteSetup) {
WikiPage suiteSetup =
PageCrawlerImpl.getInheritedPage(
SuiteResponder.SUITE_SETUP_NAME, wikiPage
);
if (suiteSetup != null) {
WikiPagePath pagePath =
suiteSetup.getPageCrawler().getFullPath(suiteSetup);
String pagePathName = PathParser.render(pagePath);
buffer.append("!include -setup .")
.append(pagePathName)
.append("\n");
}
}
WikiPage setup =
PageCrawlerImpl.getInheritedPage("SetUp", wikiPage);
if (setup != null) {
WikiPagePath setupPath =
wikiPage.getPageCrawler().getFullPath(setup);
String setupPathName = PathParser.render(setupPath);
buffer.append("!include -setup .")
.append(setupPathName)
.append("\n");
}
}
buffer.append(pageData.getContent());
if (pageData.hasAttribute("Test")) {
WikiPage teardown =
PageCrawlerImpl.getInheritedPage("TearDown", wikiPage);
if (teardown != null) {
WikiPagePath tearDownPath =
wikiPage.getPageCrawler().getFullPath(teardown);
String tearDownPathName = PathParser.render(tearDownPath);
buffer.append("\n")
.append("!include -teardown .")
.append(tearDownPathName)
.append("\n");
}
if (includeSuiteSetup) {
WikiPage suiteTeardown =
PageCrawlerImpl.getInheritedPage(
SuiteResponder.SUITE_TEARDOWN_NAME,
wikiPage
);
if (suiteTeardown != null) {
WikiPagePath pagePath =
suiteTeardown.getPageCrawler().getFullPath (suiteTeardown);
String pagePathName = PathParser.render(pagePath);
buffer.append("!include -teardown .")
.append(pagePathName)
.append("\n");
}
}
}
pageData.setContent(buffer.toString());
return pageData.getHtml();
}
```
Do you understand the function after three minutes of study? Probably not. There’s too much going on in there at too many different levels of abstraction. There are strange strings and odd function calls mixed in with doubly nested if statements controlled by flags.
> 搞懂这个函数了吗?大概没有。有太多事发生,有太多不同层级的抽象。奇怪的字符串和函数调用,混以双重嵌套、用标识来控制的 if 语句等,不一而足。
However, with just a few simple method extractions, some renaming, and a little restructuring, I was able to capture the intent of the function in the nine lines of Listing 3-2. See whether you can understand that in the next 3 minutes.
> 不过,只要做几个简单的方法抽离和重命名操作,加上一点点重构,就能在 9 行代码之内搞掂(如代码清单 3-2 所示)。用 3 分钟阅读以下代码,看你能理解吗?
Listing 3-2 HtmlUtil.java (refactored)
> 代码清单 3-2 HtmlUtil.java(重构之后)
```java
public static String renderPageWithSetupsAndTeardowns(
PageData pageData, boolean isSuite
) throws Exception {
boolean isTestPage = pageData.hasAttribute("Test");
if (isTestPage) {
WikiPage testPage = pageData.getWikiPage();
StringBuffer newPageContent = new StringBuffer();
includeSetupPages(testPage, newPageContent, isSuite);
newPageContent.append(pageData.getContent());
includeTeardownPages(testPage, newPageContent, isSuite);
pageData.setContent(newPageContent.toString());
}
return pageData.getHtml();
}
```
Unless you are a student of FitNesse, you probably don’t understand all the details. Still, you probably understand that this function performs the inclusion of some setup and teardown pages into a test page and then renders that page into HTML. If you are familiar with JUnit,2 you probably realize that this function belongs to some kind of Web-based testing framework. And, of course, that is correct. Divining that information from Listing 3-2 is pretty easy, but it’s pretty well obscured by Listing 3-1.
> 除非你正在研究 FitNesse,否则就理解不了所有细节。不过,你大概能明白,该函数包含把一些设置和拆解页放入一个测试页面,再渲染为 HTML 的操作。如果你熟悉 JUnit[2],或许会想到,该函数归属于某个基于 Web 的测试框架。而且,这当然没错。从代码清单 3-2 中获得信息很容易,而代码清单 3-1 则晦涩难明。
2. An open-source unit-testing tool for Java. www.junit.org
So what is it that makes a function like Listing 3-2 easy to read and understand? How can we make a function communicate its intent? What attributes can we give our functions that will allow a casual reader to intuit the kind of program they live inside?
> 是什么让代码清单 3-2 易于阅读和理解?怎么才能让函数表达其意图?该给函数赋予哪些属性,好让读者一看就明白函数是属于怎样的程序?
## 3.1 SMALL! 短小
The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. This is not an assertion that I can justify. I can’t provide any references to research that shows that very small functions are better. What I can tell you is that for nearly four decades I have written functions of all different sizes. I’ve written several nasty 3,000-line abominations. I’ve written scads of functions in the 100 to 300 line range. And I’ve written functions that were 20 to 30 lines long. What this experience has taught me, through long trial and error, is that functions should be very small.
> 函数的第一规则是要短小。第二条规则是还要更短小。我无法证明这个断言。我给不出任何证实了小函数更好的研究结果。我能说的是,近 40 年来,我写过各种不同大小的函数。我写过令人憎恶的长达 3000 行的厌物,也写过许多 100 行到 300 行的函数,我还写过 20 行到 30 行的。经过漫长的试错,经验告诉我,函数就该小。
In the eighties we used to say that a function should be no bigger than a screen-full. Of course we said that at a time when VT100 screens were 24 lines by 80 columns, and our editors used 4 lines for administrative purposes. Nowadays with a cranked-down font and a nice big monitor, you can fit 150 characters on a line and a 100 lines or more on a screen. Lines should not be 150 characters long. Functions should not be 100 lines long. Functions should hardly ever be 20 lines long.
> 在 20 世纪 80 年代,我们常说函数不该长于一屏。当然,说这话的时候,VT100 屏幕只有 24 行、80 列,而编辑器就得先占去 4 行空间放菜单。如今,用上了精致的字体和宽大的显示器,一屏里面可以显示 100 行,每行能容纳 150 个字符。每行都不应该有 150 个字符那么长。函数也不该有 100 行那么长,20 行封顶最佳。
How short should a function be? In 1999 I went to visit Kent Beck at his home in Oregon. We sat down and did some programming together. At one point he showed me a cute little Java/Swing program that he called Sparkle. It produced a visual effect on the screen very similar to the magic wand of the fairy godmother in the movie Cinderella. As you moved the mouse, the sparkles would drip from the cursor with a satisfying scintillation, falling to the bottom of the window through a simulated gravitational field. When Kent showed me the code, I was struck by how small all the functions were. I was used to functions in Swing programs that took up miles of vertical space. Every function in this program was just two, or three, or four lines long. Each was transparently obvious. Each told a story. And each led you to the next in a compelling order. That’s how short your functions should be!3
> 函数到底该有多长?1991 年,我去 Kent Beck 位于奥勒冈州(Oregon)的家中拜访。我们坐到一起写了些代码。他给我看一个叫做 Sparkle(火花闪耀)的有趣的 Java/Swing 小程序。程序在屏幕上描画电影 Cinderella(《灰姑娘》)中仙女用魔棒造出的那种视觉效果。只要移动鼠标,光标所在处就会爆发出一团令人欣喜的火花,沿着模拟重力场划落到窗口底部。肯特给我看代码的时候,我惊讶于其中那些函数尺寸之小。我看惯了 Swing 程序中长度数以里计的函数。但这个程序中每个函数都只有两行、三行或四行长。每个函数都一目了然。每个函数都只说一件事。而且,每个函数都依序把你带到下一个函数。这就是函数应该达到的短小程度![3]
3. I asked Kent whether he still had a copy, but he was unable to find one. I searched all my old computers too, but to no avail. All that is left now is my memory of that program.
How short should your functions be? They should usually be shorter than Listing 3-2! Indeed, Listing 3-2 should really be shortened to Listing 3-3.
> 函数应该有多短小?通常来说,应该短于代码清单 3-2 中的函数!代码清单 3-2 实在应该缩短成代码清单 3-3 这个样子。
Listing 3-3 HtmlUtil.java (re-refactored)
> 代码清单 3-3 HtmlUtil.java(再次重构之后)
```java
public static String renderPageWith
SetupsAndTeardowns(
PageData pageData, boolean isSuite) throws Exception {
if (isTestPage(pageData))
includeSetupAndTeardownPages(pageData, isSuite);
return pageData.getHtml();
}
```
### Blocks and Indenting 代码块和缩进
This implies that the blocks within if statements, else statements, while statements, and so on should be one line long. Probably that line should be a function call. Not only does this keep the enclosing function small, but it also adds documentary value because the function called within the block can have a nicely descriptive name.
> if 语句、else 语句、while 语句等,其中的代码块应该只有一行。该行大抵应该是一个函数调用语句。这样不但能保持函数短小,而且,因为块内调用的函数拥有较具说明性的名称,从而增加了文档上的价值。
This also implies that functions should not be large enough to hold nested structures. Therefore, the indent level of a function should not be greater than one or two. This, of course, makes the functions easier to read and understand.
> 这也意味着函数不应该大到足以容纳嵌套结构。所以,函数的缩进层级不该多于一层或两层。当然,这样的函数易于阅读和理解。
## 3.2 DO ONE THING 只做一件事
It should be very clear that Listing 3-1 is doing lots more than one thing. It’s creating buffers, fetching pages, searching for inherited pages, rendering paths, appending arcane strings, and generating HTML, among other things. Listing 3-1 is very busy doing lots of different things. On the other hand, Listing 3-3 is doing one simple thing. It’s including setups and teardowns into test pages.
> 代码清单 3-1 显然想做好几件事。它创建缓冲区、获取页面、搜索继承下来的页面、渲染路径、添加神秘的字符串、生成 HTML,如此等等。代码清单 3-1 手忙脚乱。而代码清单 3-3 则只做一件简单的事。它将设置和拆解包纳到测试页面中。
The following advice has appeared in one form or another for 30 years or more.
> 过去 30 年以来,以下建议以不同形式一再出现:

FUNCTIONS SHOULD DO ONE THING. THEY SHOULD DO IT WELL. THEY SHOULD DO IT ONLY.
> 函数应该做一件事。做好这件事。只做这一件事。
The problem with this statement is that it is hard to know what “one thing” is. Does Listing 3-3 do one thing? It’s easy to make the case that it’s doing three things:
> 问题在于很难知道那件该做的事是什么。代码清单 3-3 只做了一件事,对吧?其实也很容易看作是三件事:
1. Determining whether the page is a test page.
2. If so, including setups and teardowns.
3. Rendering the page in HTML.
---
> 1. 判断是否为测试页面;
> 2. 如果是,则容纳进设置和分拆步骤;
> 3. 渲染成 HTML。
So which is it? Is the function doing one thing or three things? Notice that the three steps of the function are one level of abstraction below the stated name of the function. We can describe the function by describing it as a brief TO4 paragraph:
> 那件事是什么?函数是做了一件事呢,还是做了三件事?注意,这三个步骤均在该函数名下的同一抽象层上。可以用简洁的 TO[4]起头段落来描述这个函数:
4. The LOGO language used the keyword “TO” in the same way that Ruby and Python use “def.” So every function began with the word “TO.” This had an interesting effect on the way functions were designed.
TO RenderPageWithSetupsAndTeardowns, we check to see whether the page is a test page and if so, we include the setups and teardowns. In either case we render the page in HTML.
> (要 RenderPageWithSetupsAndTeardowns,检查页面是否为测试页,如果是测试页,就容纳进设置和分拆步骤。无论是否测试页,都渲染成 HTML)
If a function does only those steps that are one level below the stated name of the function, then the function is doing one thing. After all, the reason we write functions is to decompose a larger concept (in other words, the name of the function) into a set of steps at the next level of abstraction.
> 如果函数只是做了该函数名下同一抽象层上的步骤,则函数还是只做了一件事。编写函数毕竟是为了把大一些的概念(换言之,函数的名称)拆分为另一抽象层上的一系列步骤。
It should be very clear that Listing 3-1 contains steps at many different levels of abstraction. So it is clearly doing more than one thing. Even Listing 3-2 has two levels of abstraction, as proved by our ability to shrink it down. But it would be very hard to meaningfully shrink Listing 3-3. We could extract the if statement into a function named includeSetupsAndTeardownsIfTestPage, but that simply restates the code without changing the level of abstraction.
> 代码清单 3-1 明显包括了处于多个不同抽象层级的步骤。显然,它所做的不止一件事。即便是代码清单 3-2 也有两个抽象层,这已被我们将其缩短的能力所证明。然而,很难再将代码清单 3-3 做有意义的缩短。可以将 if 语句拆出来做一个名为 includeSetupAndTeardonwsIfTestpage 的函数,但那只是重新诠释代码,并未改变抽象层级。
So, another way to know that a function is doing more than “one thing” is if you can extract another function from it with a name that is not merely a restatement of its implementation [G34].
> 所以,要判断函数是否不止做了一件事,还有一个方法,就是看是否能再拆出一个函数,该函数不仅只是单纯地重新诠释其实现[G34]。
### Sections within Functions 函数中的区段
Look at Listing 4-7 on page 71. Notice that the generatePrimes function is divided into sections such as declarations, initializations, and sieve. This is an obvious symptom of doing more than one thing. Functions that do one thing cannot be reasonably divided into sections.
> 请看代码清单 4-7。注意,generatePrimes 函数被切分为 declarations、initializations 和 sieve 等区段。这就是函数做事太多的明显征兆。只做一件事的函数无法被合理地切分为多个区段。
## 3.3 ONE LEVEL OF ABSTRACTION PER FUNCTION 每个函数一个抽象层级
In order to make sure our functions are doing “one thing,” we need to make sure that the statements within our function are all at the same level of abstraction. It is easy to see how Listing 3-1 violates this rule. There are concepts in there that are at a very high level of abstraction, such as getHtml(); others that are at an intermediate level of abstraction, such as: String pagePathName = PathParser.render(pagePath); and still others that are remarkably low level, such as: .append(”\n”).
> 要确保函数只做一件事,函数中的语句都要在同一抽象层级上。一眼就能看出,代码清单 3-1 违反了这条规矩。那里面有 getHtml( )等位于较高抽象层的概念,也有 String pagePathName = PathParser.render(pagePath)等位于中间抽象层的概念,还有.append("\n") 等位于相当低的抽象层的概念。
Mixing levels of abstraction within a function is always confusing. Readers may not be able to tell whether a particular expression is an essential concept or a detail. Worse, like broken windows, once details are mixed with essential concepts, more and more details tend to accrete within the function.
> 函数中混杂不同抽象层级,往往让人迷惑。读者可能无法判断某个表达式是基础概念还是细节。更恶劣的是,就像破损的窗户,一旦细节与基础概念混杂,更多的细节就会在函数中纠结起来。
### Reading Code from Top to Bottom: The Stepdown Rule 自顶向下读代码:向下规则
We want the code to read like a top-down narrative.5 We want every function to be followed by those at the next level of abstraction so that we can read the program, descending one level of abstraction at a time as we read down the list of functions. I call this The Step-down Rule.
> 我们想要让代码拥有自顶向下的阅读顺序。[5]我们想要让每个函数后面都跟着位于下一抽象层级的函数,这样一来,在查看函数列表时,就能偱抽象层级向下阅读了。我把这叫做向下规则。
5. [KP78], p. 37.
To say this differently, we want to be able to read the program as though it were a set of TO paragraphs, each of which is describing the current level of abstraction and referencing subsequent TO paragraphs at the next level down.
> 换一种说法。我们想要这样读程序:程序就像是一系列 TO 起头的段落,每一段都描述当前抽象层级,并引用位于下一抽象层级的后续 TO 起头段落。
- To include the setups and teardowns, we include setups, then we include the test page content, and then we include the teardowns.
- To include the setups, we include the suite setup if this is a suite, then we include the regular setup.
- To include the suite setup, we search the parent hierarchy for the “SuiteSetUp” page and add an include statement with the path of that page.
- To search the parent…
---
> - (要容纳设置和分拆步骤,就先容纳设置步骤,然后纳入测试页面内容,再纳入分拆步骤。)
> - (要容纳设置步骤,如果是套件,就纳入套件设置步骤,然后再纳入普通设置步骤。)
> - (要容纳套件设置步骤,先搜索“SuiteSetUp”页面的上级继承关系,再添加一个包括该页面路径的语句。)
> - (要搜索……)
It turns out to be very difficult for programmers to learn to follow this rule and write functions that stay at a single level of abstraction. But learning this trick is also very important. It is the key to keeping functions short and making sure they do “one thing.” Making the code read like a top-down set of TO paragraphs is an effective technique for keeping the abstraction level consistent.
> 程序员往往很难学会遵循这条规则,写出只停留于一个抽象层级上的函数。尽管如此,学习这个技巧还是很重要。这是保持函数短小、确保只做一件事的要诀。让代码读起来像是一系列自顶向下的 TO 起头段落是保持抽象层级协调一致的有效技巧。
Take a look at Listing 3-7 at the end of this chapter. It shows the whole testableHtml function refactored according to the principles described here. Notice how each function introduces the next, and each function remains at a consistent level of abstraction.
> 看看本章末尾的代码清单 3-7。它展示了遵循这条原则重构的完整 testableHtml 函数。留意每个函数是如何引出下一个函数,如何保持在同一抽象层上的。
## 3.4 SWITCH STATEMENTS switch 语句
It’s hard to make a small switch statement.6 Even a switch statement with only two cases is larger than I’d like a single block or function to be. It’s also hard to make a switch statement that does one thing. By their nature, switch statements always do N things. Unfortunately we can’t always avoid switch statements, but we can make sure that each switch statement is buried in a low-level class and is never repeated. We do this, of course, with polymorphism.
> 写出短小的 switch 语句很难[6]。即便是只有两种条件的 switch 语句也要比我想要的单个代码块或函数大得多。写出只做一件事的 switch 语句也很难。Switch 天生要做 N 件事。不幸我们总无法避开 switch 语句,不过还是能够确保每个 switch 都埋藏在较低的抽象层级,而且永远不重复。当然,我们利用多态来实现这一点。
6. And, of course, I include if/else chains in this.
Consider Listing 3-4. It shows just one of the operations that might depend on the type of employee.
> 请看代码清单 3-4。它呈现了可能依赖于雇员类型的仅仅一种操作。
Listing 3-4 Payroll.java
> 代码清单 3-4 Payroll.java
```java
public Money calculatePay(Employee e)
throws InvalidEmployeeType {
switch (e.type) {
case COMMISSIONED:
return calculateCommissionedPay(e);
case HOURLY:
return calculateHourlyPay(e);
case SALARIED:
return calculateSalariedPay(e);
default:
throw new InvalidEmployeeType(e.type);
}
}
```
There are several problems with this function. First, it’s large, and when new employee types are added, it will grow. Second, it very clearly does more than one thing. Third, it violates the Single Responsibility Principle7 (SRP) because there is more than one reason for it to change. Fourth, it violates the Open Closed Principle8 (OCP) because it must change whenever new types are added. But possibly the worst problem with this function is that there are an unlimited number of other functions that will have the same structure. For example we could have
> 该函数有好几个问题。首先,它太长,当出现新的雇员类型时,还会变得更长。其次,它明显做了不止一件事。第三,它违反了单一权责原则(Single Responsibility Principle[7], SRP),因为有好几个修改它的理由。第四,它违反了开放闭合原则(Open Closed Principle[8],OCP),因为每当添加新类型时,就必须修改之。不过,该函数最麻烦的可能是到处皆有类似结构的函数。例如,可能会有
7. a. http://en.wikipedia.org/wiki/Single_responsibility_principle
b. http://www.objectmentor.com/resources/articles/srp.pdf
8. a. http://en.wikipedia.org/wiki/Open/closed_principle
b. http://www.objectmentor.com/resources/articles/ocp.pdf
```java
isPayday(Employee e, Date date),
```
or
> 或
```java
deliverPay(Employee e, Money pay),
```
or a host of others. All of which would have the same deleterious structure.
> 如此等等。它们的结构都有同样的问题。
The solution to this problem (see Listing 3-5) is to bury the switch statement in the basement of an ABSTRACT FACTORY,9 and never let anyone see it. The factory will use the switch statement to create appropriate instances of the derivatives of Employee, and the various functions, such as calculatePay, isPayday, and deliverPay, will be dispatched polymorphically through the Employee interface.
> 该问题的解决方案(如代码清单 3-5 所示)是将 switch 语句埋到抽象工厂[9]底下,不让任何人看到。该工厂使用 switch 语句为 Employee 的派生物创建适当的实体,而不同的函数,如 calculatePay、isPayday 和 deliverPay 等,则藉由 Employee 接口多态地接受派遣。
9. [GOF].
My general rule for switch statements is that they can be tolerated if they appear only once, are used to create polymorphic objects, and are hidden behind an inheritance relationship so that the rest of the system can’t see them [G23]. Of course every circumstance is unique, and there are times when I violate one or more parts of that rule.
> 对于 switch 语句,我的规矩是如果只出现一次,用于创建多态对象,而且隐藏在某个继承关系中,在系统其他部分看不到,就还能容忍 [G23]。当然也要就事论事,有时我也会部分或全部违反这条规矩。
Listing 3-5 Employee and Factory
> 代码清单 3-5 Employee 与工厂
```java
public abstract class Employee {
public abstract boolean isPayday();
public abstract Money calculatePay();
public abstract void deliverPay(Money pay);
}
-----------------
public interface EmployeeFactory {
public Employee makeEmployee(EmployeeRecord r) throws InvalidEmployeeType;
}
-----------------
public class EmployeeFactoryImpl implements
EmployeeFactory {
public Employee makeEmployee(EmployeeRecord r) throws InvalidEmployeeType {
switch (r.type) {
case COMMISSIONED:
return new CommissionedEmployee(r) ;
case HOURLY:
return new HourlyEmployee(r);
case SALARIED:
return new SalariedEmploye(r);
default:
throw new InvalidEmployeeType(r.type);
}
}
}
```
## 3.5 USE DESCRIPTIVE NAMES 使用描述性的名称
In Listing 3-7 I changed the name of our example function from testableHtml to SetupTeardownIncluder.render. This is a far better name because it better describes what the function does. I also gave each of the private methods an equally descriptive name such as isTestable or includeSetupAndTeardownPages. It is hard to overestimate the value of good names. Remember Ward’s principle: “You know you are working on clean code when each routine turns out to be pretty much what you expected.” Half the battle to achieving that principle is choosing good names for small functions that do one thing. The smaller and more focused a function is, the easier it is to choose a descriptive name.
> 在代码清单 3-7 中,我把示例函数的名称从 testableHtml 改为 SetupTeardownIncluder.render。这个名称好得多,因为它较好地描述了函数做的事。我也给每个私有方法取个同样具有描述性的名称,如 isTestable 或 includeSetupAndTeardownPages。好名称的价值怎么好评都不为过。记住沃德原则:“如果每个例程都让你感到深合己意,那就是整洁代码。”要遵循这一原则,泰半工作都在于为只做一件事的小函数取个好名字。函数越短小、功能越集中,就越便于取个好名字。
Don’t be afraid to make a name long. A long descriptive name is better than a short enigmatic name. A long descriptive name is better than a long descriptive comment. Use a naming convention that allows multiple words to be easily read in the function names, and then make use of those multiple words to give the function a name that says what it does.
> 别害怕长名称。长而具有描述性的名称,要比短而令人费解的名称好。长而具有描述性的名称,要比描述性的长注释好。使用某种命名约定,让函数名称中的多个单词容易阅读,然后使用这些单词给函数取个能说清其功用的名称。
Don’t be afraid to spend time choosing a name. Indeed, you should try several different names and read the code with each in place. Modern IDEs like Eclipse or IntelliJ make it trivial to change names. Use one of those IDEs and experiment with different names until you find one that is as descriptive as you can make it.
> 别害怕花时间取名字。你当尝试不同的名称,实测其阅读效果。在 Eclipse 或 IntelliJ 等现代 IDE 中改名称易如反掌。使用这些 IDE 测试不同名称,直至找到最具有描述性的那一个为止。
Choosing descriptive names will clarify the design of the module in your mind and help you to improve it. It is not at all uncommon that hunting for a good name results in a favorable restructuring of the code.
> 选择描述性的名称能理清你关于模块的设计思路,并帮你改进之。追索好名称,往往导致对代码的改善重构。
Be consistent in your names. Use the same phrases, nouns, and verbs in the function names you choose for your modules. Consider, for example, the names includeSetup-AndTeardownPages, includeSetupPages, includeSuiteSetupPage, and includeSetupPage. The similar phraseology in those names allows the sequence to tell a story. Indeed, if I showed you just the sequence above, you’d ask yourself: “What happened to includeTeardownPages, includeSuiteTeardownPage, and includeTeardownPage?” How’s that for being “… pretty much what you expected.”
> 命名方式要保持一致。使用与模块名一脉相承的短语、名词和动词给函数命名。例如,includeSetupAndTeardownPages、includeSetupPages、includeSuiteSetupPage 和 includeSetupPage 等。这些名称使用了类似的措辞,依序讲出一个故事。实际上,假使我只给你看上述函数序列,你就会自问:“includeTeardownPages、includeSuiteTeardownPages 和 includeTeardownPage 又会如何?”这就是所谓“深合己意”了。
## 3.6 FUNCTION ARGUMENTS 函数参数
The ideal number of arguments for a function is zero (niladic). Next comes one (monadic), followed closely by two (dyadic). Three arguments (triadic) should be avoided where possible. More than three (polyadic) requires very special justification—and then shouldn’t be used anyway.
> 最理想的参数数量是零(零参数函数),其次是一(单参数函数),再次是二(双参数函数),应尽量避免三(三参数函数)。有足够特殊的理由才能用三个以上参数(多参数函数)——所以无论如何也不要这么做。

Arguments are hard. They take a lot of conceptual power. That’s why I got rid of almost all of them from the example. Consider, for instance, the StringBuffer in the example. We could have passed it around as an argument rather than making it an instance variable, but then our readers would have had to interpret it each time they saw it. When you are reading the story told by the module, includeSetupPage() is easier to understand than includeSetupPageInto(newPage-Content). The argument is at a different level of abstraction than the function name and forces you to know a detail (in other words, StringBuffer) that isn’t particularly important at that point.
> 参数不易对付。它们带有太多概念性。所以我在代码范例中几乎不加参数。比如,以 StringBuffer 为例,我们可能不把它作为实体变量,而是当作参数来传递,那样的话,读者每次看到它都得要翻译一遍。阅读模块所讲述的故事时,includeSetupPage()要比 includeSetupPageInto(newPage-Content)易于理解。参数与函数名处在不同的抽象层级,它要求你了解目前并不特别重要的细节(即那个 StringBuffer)。
Arguments are even harder from a testing point of view. Imagine the difficulty of writing all the test cases to ensure that all the various combinations of arguments work properly. If there are no arguments, this is trivial. If there’s one argument, it’s not too hard. With two arguments the problem gets a bit more challenging. With more than two arguments, testing every combination of appropriate values can be daunting.
> 从测试的角度看,参数甚至更叫人为难。想想看,要编写能确保参数的各种组合运行正常的测试用例,是多么困难的事。如果没有参数,就是小菜一碟。如果只有一个参数,也不太困难。有两个参数,问题就麻烦多了。如果参数多于两个,测试覆盖所有可能值的组合简直让人生畏。
Output arguments are harder to understand than input arguments. When we read a function, we are used to the idea of information going in to the function through arguments and out through the return value. We don’t usually expect information to be going out through the arguments. So output arguments often cause us to do a double-take.
> 输出参数比输入参数还要难以理解。读函数时,我们惯于认为信息通过参数输入函数,通过返回值从函数中输出。我们不太期望信息通过参数输出。所以,输出参数往往让人苦思之后才恍然大悟。
One input argument is the next best thing to no arguments. SetupTeardown-Includer.render(pageData) is pretty easy to understand. Clearly we are going to render the data in the pageData object.
> 相较于没有参数,只有一个输入参数算是第二好的做法。SetupTeardownInclude.render (pageData)也相当易于理解。很明显,我们将渲染 pageData 对象中的数据。
### 3.6.1 Common Monadic Forms 一元函数的普遍形式
There are two very common reasons to pass a single argument into a function. You may be asking a question about that argument, as in boolean fileExists(“MyFile”). Or you may be operating on that argument, transforming it into something else and returning it. For example, InputStream fileOpen(“MyFile”) transforms a file name String into an InputStream return value. These two uses are what readers expect when they see a function. You should choose names that make the distinction clear, and always use the two forms in a consistent context. (See Command Query Separation below.)
> 向函数传入单个参数有两种极普遍的理由。你也许会问关于那个参数的问题,就像在 boolean fileExists("MyFile")中那样。也可能是操作该参数,将其转换为其他什么东西,再输出之。例如,InputStream fileOpen("MyFile")把 String 类型的文件名转换为 InputStream 类型的返回值。这就是读者看到函数时所期待的东西。你应当选用较能区别这两种理由的名称,而且总在一致的上下文中使用这两种形式。
A somewhat less common, but still very useful form for a single argument function, is an event. In this form there is an input argument but no output argument. The overall program is meant to interpret the function call as an event and use the argument to alter the state of the system, for example, void passwordAttemptFailedNtimes(int attempts). Use this form with care. It should be very clear to the reader that this is an event. Choose names and contexts carefully.
> 还有一种虽不那么普遍但仍极有用的单参数函数形式,那就是事件(event)。在这种形式中,有输入参数而无输出参数。程序将函数看作是一个事件,使用该参数修改系统状态,例如 void passwordAttemptFailedNtimes(int attempts)。小心使用这种形式。应该让读者很清楚地了解它是个事件。谨慎地选用名称和上下文语境。
Try to avoid any monadic functions that don’t follow these forms, for example, void includeSetupPageInto(StringBuffer pageText). Using an output argument instead of a return value for a transformation is confusing. If a function is going to transform its input argument, the transformation should appear as the return value. Indeed, StringBuffer transform(StringBuffer in) is better than void transform(StringBuffer out), even if the implementation in the first case simply returns the input argument. At least it still follows the form of a transformation.
> 尽量避免编写不遵循这些形式的一元函数,例如,void includeSetupPageInto(StringBuffer pageText)。对于转换,使用输出参数而非返回值令人迷惑。如果函数要对输入参数进行转换操作,转换结果就该体现为返回值。实际上,StringBuffer transform(StringBuffer in)要比 void transform(StringBuffer out)强,即便第一种形式只简单地返回输参数也是这样。至少,它遵循了转换的形式。
### 3.6.2 Flag Arguments 标识参数
Flag arguments are ugly. Passing a boolean into a function is a truly terrible practice. It immediately complicates the signature of the method, loudly proclaiming that this function does more than one thing. It does one thing if the flag is true and another if the flag is false!
> 标识参数丑陋不堪。向函数传入布尔值简直就是骇人听闻的做法。这样做,方法签名立刻变得复杂起来,大声宣布本函数不止做一件事。如果标识为 true 将会这样做,标识为 false 则会那样做!
In Listing 3-7 we had no choice because the callers were already passing that flag in, and I wanted to limit the scope of refactoring to the function and below. Still, the method call render(true) is just plain confusing to a poor reader. Mousing over the call and seeing render(boolean isSuite) helps a little, but not that much. We should have split the function into two: renderForSuite() and renderForSingleTest().
> 在代码清单 3-7 中,我们别无选择,因为调用者已经传入了那个标识,而我想把重构范围限制在该函数及该函数以下范围之内。方法调用 render(true)对于可怜的读者来说仍然摸不着头脑。卷动屏幕,看到 render(Boolean isSuite),稍许有点帮助,不过仍然不够。应该把该函数一分为二:reanderForSuite( )和 renderForSingleTest( )。
### 3.6.3 Dyadic Functions 二元函数
A function with two arguments is harder to understand than a monadic function. For example, writeField(name) is easier to understand than writeField(output-Stream, name).10 Though the meaning of both is clear, the first glides past the eye, easily depositing its meaning. The second requires a short pause until we learn to ignore the first parameter. And that, of course, eventually results in problems because we should never ignore any part of code. The parts we ignore are where the bugs will hide.
> 有两个参数的函数要比一元函数难懂。例如,writeField(name)比 writeField(outputStream,name)[10]好懂。尽管两种情况下意义都很清楚,但第一个只要扫一眼就明白,更好地表达了其意义。第二个就得暂停一下才能明白,除非我们学会忽略第一个参数。而且最终那也会导致问题,因为我们根本就不该忽略任何代码。忽略掉的部分就是缺陷藏身之地。
10. I just finished refactoring a module that used the dyadic form. I was able to make the outputStream a field of the class and convert all the writeField calls to the monadic form. The result was much cleaner.
There are times, of course, where two arguments are appropriate. For example, Point p = new Point(0,0); is perfectly reasonable. Cartesian points naturally take two arguments. Indeed, we’d be very surprised to see new Point(0). However, the two arguments in this case are ordered components of a single value! Whereas output-Stream and name have neither a natural cohesion, nor a natural ordering.
> 当然,有些时候两个参数正好。例如,Point p = new Point(0,0);就相当合理。笛卡儿点天生拥有两个参数。如果看到 new Point(0),我们会倍感惊讶。然而,本例中的两个参数却只是单个值的有序组成部分!而 output-Stream 和 name 则既非自然的组合,也不是自然的排序。
Even obvious dyadic functions like assertEquals(expected, actual) are problematic. How many times have you put the actual where the expected should be? The two arguments have no natural ordering. The expected, actual ordering is a convention that requires practice to learn.
> 即便是如 assertEquals(expected, actual)这样的二元函数也有其问题。你有多少次会搞错 actual 和 expected 的位置呢?这两个参数没有自然的顺序。expected 在前,actual 在后,只是一种需要学习的约定罢了。
Dyads aren’t evil, and you will certainly have to write them. However, you should be aware that they come at a cost and should take advantage of what mechanisms may be available to you to convert them into monads. For example, you might make the writeField method a member of outputStream so that you can say outputStream. writeField(name). Or you might make the outputStream a member variable of the current class so that you don’t have to pass it. Or you might extract a new class like FieldWriter that takes the outputStream in its constructor and has a write method.
> 二元函数不算恶劣,而且你当然也会编写二元函数。不过,你得小心,使用二元函数要付出代价。你应该尽量利用一些机制将其转换成一元函数。例如,可以把 writeField 方法写成 outputStream 的成员之一,从而能这样用:outputStream.writeField(name)。或者,也可以把 outputStream 写成当前类的成员变量,从而无需再传递它。还可以分离出类似 FieldWriter 的新类,在其构造器中采用 outputStream,并且包含一个 write 方法。
### 3.6.4 Triads 三元函数
Functions that take three arguments are significantly harder to understand than dyads. The issues of ordering, pausing, and ignoring are more than doubled. I suggest you think very carefully before creating a triad.
> 有三个参数的函数要比二元函数难懂得多。排序、琢磨、忽略的问题都会加倍体现。建议你在写三元函数前一定要想清楚。
For example, consider the common overload of assertEquals that takes three arguments: assertEquals(message, expected, actual). How many times have you read the message and thought it was the expected? I have stumbled and paused over that particular triad many times. In fact, every time I see it, I do a double-take and then learn to ignore the message.
> 例如,设想 assertEquals 有三个参数:assertEquals(message, expected, actual)。有多少次,你读到 message,错以为它是 expected 呢?我就常栽在这个三元函数上。实际上,每次我看到这里,总会绕半天圈子,最后学会了忽略 message 参数。
On the other hand, here is a triad that is not quite so insidious: assertEquals(1.0, amount, .001). Although this still requires a double-take, it’s one that’s worth taking. It’s always good to be reminded that equality of floating point values is a relative thing.
> 另一方面,这里有个并不那么险恶的三元函数:assertEquals(1.0, amount, .001)。虽然也要费点神,还是值得的。得到“浮点值的等值是相对而言”的提示总是好的。
### 3.6.5 Argument Objects 参数对象
When a function seems to need more than two or three arguments, it is likely that some of those arguments ought to be wrapped into a class of their own. Consider, for example, the difference between the two following declarations:
> 如果函数看来需要两个、三个或三个以上参数,就说明其中一些参数应该封装为类了。例如,下面两个声明的差别:
```java
Circle makeCircle(double x, double y, double radius);
Circle makeCircle(Point center, double radius);
```
Reducing the number of arguments by creating objects out of them may seem like cheating, but it’s not. When groups of variables are passed together, the way x and y are in the example above, they are likely part of a concept that deserves a name of its own.
> 从参数创建对象,从而减少参数数量,看起来像是在作弊,但实则并非如此。当一组参数被共同传递,就像上例中的 x 和 y 那样,往往就是该有自己名称的某个概念的一部分。
### 3.6.6 Argument Lists 参数列表
Sometimes we want to pass a variable number of arguments into a function. Consider, for example, the String.format method:
> 有时,我们想要向函数传入数量可变的参数。例如,
```java
String.format(”%s worked %.2f hours.”, name, hours);
```
If the variable arguments are all treated identically, as they are in the example above, then they are equivalent to a single argument of type List. By that reasoning, String.format is actually dyadic. Indeed, the declaration of String.format as shown below is clearly dyadic.
> 如果可变参数像上例中那样被同等对待,就和类型为 List 的单个参数没什么两样。这样一来,String.formate 实则是二元函数。下列 String.format 的声明也很明显是二元的:
```java
public String format(String format, Object… args)
```
So all the same rules apply. Functions that take variable arguments can be monads, dyads, or even triads. But it would be a mistake to give them more arguments than that.
> 同理,有可变参数的函数可能是一元、二元甚至三元。超过这个数量就可能要犯错了。
```java
void monad(Integer… args);
void dyad(String name, Integer… args);
void triad(String name, int count, Integer… args);
```
### 3.6.7 Verbs and Keywords 动词与关键字
Choosing good names for a function can go a long way toward explaining the intent of the function and the order and intent of the arguments. In the case of a monad, the function and argument should form a very nice verb/noun pair. For example, write(name) is very evocative. Whatever this “name” thing is, it is being “written.” An even better name might be writeField(name), which tells us that the “name” thing is a “field.”
> 给函数取个好名字,能较好地解释函数的意图,以及参数的顺序和意图。对于一元函数,函数和参数应当形成一种非常良好的动词/名词对形式。例如,write(name)就相当令人认同。不管这个“name”是什么,都要被“write”。更好的名称大概是 writeField(name),它告诉我们,“name”是一个“field”。
This last is an example of the keyword form of a function name. Using this form we encode the names of the arguments into the function name. For example, assertEquals might be better written as assertExpectedEqualsActual(expected, actual). This strongly mitigates the problem of having to remember the ordering of the arguments.
> 最后那个例子展示了函数名称的关键字(keyword)形式。使用这种形式,我们把参数的名称编码成了函数名。例如,assertEqual 改成 assertExpectedEqualsActual(expected, actual)可能会好些。这大大减轻了记忆参数顺序的负担。
## 3.7 HAVE NO SIDE EFFECTS 无副作用
Side effects are lies. Your function promises to do one thing, but it also does other hidden things. Sometimes it will make unexpected changes to the variables of its own class. Sometimes it will make them to the parameters passed into the function or to system globals. In either case they are devious and damaging mistruths that often result in strange temporal couplings and order dependencies.
> 副作用是一种谎言。函数承诺只做一件事,但还是会做其他被藏起来的事。有时,它会对自己类中的变量做出未能预期的改动。有时,它会把变量搞成向函数传递的参数或是系统全局变量。无论哪种情况,都是具有破坏性的,会导致古怪的时序性耦合及顺序依赖。
Consider, for example, the seemingly innocuous function in Listing 3-6. This function uses a standard algorithm to match a userName to a password. It returns true if they match and false if anything goes wrong. But it also has a side effect. Can you spot it?
> 以代码清单 3-6 中看似无伤大雅的函数为例。该函数使用标准算法来匹配 userName 和 password。如果匹配成功,返回 true,如果失败则返回 false。但它会有副作用。你知道问题所在吗?
Listing 3-6 UserValidator.java
> 代码清单 3-6 UserValidator.java
```java
public class UserValidator {
private Cryptographer cryptographer;
public boolean checkPassword(String userName, String password) {
User user = UserGateway.findByName(userName);
if (user != User.NULL) {
String codedPhrase = user.
getPhraseEncodedByPassword();
String phrase = cryptographer.decrypt(codedPhrase, password);
if ("Valid Password".equals(phrase)) {
Session.initialize();
return true;
}
}
return false;
}
}
```
The side effect is the call to Session.initialize(), of course. The checkPassword function, by its name, says that it checks the password. The name does not imply that it initializes the session. So a caller who believes what the name of the function says runs the risk of erasing the existing session data when he or she decides to check the validity of the user.
> 当然了,副作用就在于对 Session.initialize( )的调用。checkPassword 函数,顾名思义,就是用来检查密码的。该名称并未暗示它会初始化该次会话。所以,当某个误信了函数名的调用者想要检查用户有效性时,就得冒抹除现有会话数据的风险。
This side effect creates a temporal coupling. That is, checkPassword can only be called at certain times (in other words, when it is safe to initialize the session). If it is called out of order, session data may be inadvertently lost. Temporal couplings are confusing, especially when hidden as a side effect. If you must have a temporal coupling, you should make it clear in the name of the function. In this case we might rename the function checkPasswordAndInitializeSession, though that certainly violates “Do one thing.”
> 这一副作用造出了一次时序性耦合。也就是说,checkPassword 只能在特定时刻调用(换言之,在初始化会话是安全的时候调用)。如果在不合适的时候调用,会话数据就有可能沉默地丢失。时序性耦合令人迷惑,特别是当它躲在副作用后面时。如果一定要时序性耦合,就应该在函数名称中说明。在本例中,可以重命名函数为 checkPasswordAndInitializeSession,虽然那还是违反了“只做一件事”的规则。
### Output Arguments 输出参数
Arguments are most naturally interpreted as inputs to a function. If you have been programming for more than a few years, I’m sure you’ve done a double-take on an argument that was actually an output rather than an input. For example:
> 参数多数会被自然而然地看作是函数的输入。如果你编过好些年程序,我担保你一定被用作输出而非输入的参数迷惑过。例如:
```java
appendFooter(s);
```
Does this function append s as the footer to something? Or does it append some footer to s? Is s an input or an output? It doesn’t take long to look at the function signature and see:
> 这个函数是把 s 添加到什么东西后面吗?或者它把什么东西添加到了 s 后面?s 是输入参数还是输出参数?稍许花点时间看看函数签名:
```java
public void appendFooter(StringBuffer report)
```
This clarifies the issue, but only at the expense of checking the declaration of the function. Anything that forces you to check the function signature is equivalent to a double-take. It’s a cognitive break and should be avoided.
> 事情清楚了,但付出了检查函数声明的代价。你被迫检查函数签名,就得花上一点时间。应该避免这种中断思路的事。
In the days before object oriented programming it was sometimes necessary to have output arguments. However, much of the need for output arguments disappears in OO languages because this is intended to act as an output argument. In other words, it would be better for appendFooter to be invoked as
> 在面向对象编程之前的岁月里,有时的确需要输出参数。然而,面向对象语言中对输出参数的大部分需求已经消失了,因为 this 也有输出函数的意味在内。换言之,最好是这样调用 appendFooter:
```java
report.appendFooter();
```
In general output arguments should be avoided. If your function must change the state of something, have it change the state of its owning object.
> 在面向对象编程之前的岁月里,有时的确需要输出参数。然而,面向对象语言中对输出参数的大部分需求已经消失了,因为 this 也有输出函数的意味在内。换言之,最好是这样调用 appendFooter:
## 3.8 COMMAND QUERY SEPARATION 分隔指令与询问
Functions should either do something or answer something, but not both. Either your function should change the state of an object, or it should return some information about that object. Doing both often leads to confusion. Consider, for example, the following function:
> 函数要么做什么事,要么回答什么事,但二者不可得兼。函数应该修改某对象的状态,或是返回该对象的有关信息。两样都干常会导致混乱。看看下面的例子:
```java
public boolean set(String attribute, String value);
```
This function sets the value of a named attribute and returns true if it is successful and false if no such attribute exists. This leads to odd statements like this:
> 该函数设置某个指定属性,如果成功就返回 true,如果不存在那个属性则返回 false。这样就导致了以下语句:
```java
if (set(”username”, ”unclebob”))…
```
Imagine this from the point of view of the reader. What does it mean? Is it asking whether the “username” attribute was previously set to “unclebob”? Or is it asking whether the “username” attribute was successfully set to “unclebob”? It’s hard to infer the meaning from the call because it’s not clear whether the word “set” is a verb or an adjective.
> 从读者的角度考虑一下吧。这是什么意思呢?它是在问 username 属性值是否之前已设置为 unclebob 吗?或者它是在问 username 属性值是否成功设置为 unclebob 呢?从这行调用很难判断其含义,因为 set 是动词还是形容词并不清楚。
The author intended set to be a verb, but in the context of the if statement it feels like an adjective. So the statement reads as “If the username attribute was previously set to unclebob” and not “set the username attribute to unclebob and if that worked then.…” We could try to resolve this by renaming the set function to setAndCheckIfExists, but that doesn’t much help the readability of the if statement. The real solution is to separate the command from the query so that the ambiguity cannot occur.
> 作者本意,set 是个动词,但在 if 语句的上下文中,感觉它像是个形容词。该语句读起来像是说“如果 username 属性值之前已被设置为 uncleob”,而不是“设置 username 属性值为 unclebob,看看是否可行,然后……”。要解决这个问题,可以将 set 函数重命名为 setAndCheckIfExists,但这对提高 if 语句的可读性帮助不大。真正的解决方案是把指令与询问分隔开来,防止混淆的发生:
```java
if (attributeExists(”username”)) {
setAttribute(”username”, ”unclebob”);
…
}
```
## 3.9 PREFER EXCEPTIONS TO RETURNING ERROR CODES 使用异常替代返回错误码
Returning error codes from command functions is a subtle violation of command query separation. It promotes commands being used as expressions in the predicates of if statements.
> 从指令式函数返回错误码轻微违反了指令与询问分隔的规则。它鼓励了在 if 语句判断中把指令当作表达式使用。
```java
if (deletePage(page) == E_OK)
```
This does not suffer from verb/adjective confusion but does lead to deeply nested structures. When you return an error code, you create the problem that the caller must deal with the error immediately.
> 这不会引起动词/形容词混淆,但却导致更深层次的嵌套结构。当返回错误码时,就是在要求调用者立刻处理错误。
```java
if (deletePage(page) == E_OK) {
if (registry.deleteReference(page.name) == E_OK) {
if (configKeys.deleteKey(page.name.makeKey()) == E_OK){
logger.log("page deleted");
} else {
logger.log("configKey not deleted");
}
} else {
logger.log("deleteReference from registry failed");
}
} else {
logger.log("delete failed");
return E_ERROR;
}
```
On the other hand, if you use exceptions instead of returned error codes, then the error processing code can be separated from the happy path code and can be simplified:
> 另一方面,如果使用异常替代返回错误码,错误处理代码就能从主路径代码中分离出来,得到简化:
```java
try {
deletePage(page);
registry.deleteReference(page.name);
configKeys.deleteKey(page.name.makeKey());
}
catch (Exception e) {
logger.log(e.getMessage());
}
```
### 3.9.1 Extract Try/Catch Blocks 抽离 Try/Catch 代码块
Try/catch blocks are ugly in their own right. They confuse the structure of the code and mix error processing with normal processing. So it is better to extract the bodies of the try and catch blocks out into functions of their own.
> Try/catch 代码块丑陋不堪。它们搞乱了代码结构,把错误处理与正常流程混为一谈。最好把 try 和 catch 代码块的主体部分抽离出来,另外形成函数。
```java
public void delete(Page page) {
try {
deletePageAndAllReferences(page);
}
catch (Exception e) {
logError(e);
}
}
private void deletePageAndAllReferences(Page page) throws Exception {
deletePage(page);
registry.deleteReference(page.name);
configKeys.deleteKey(page.name.makeKey());
}
private void logError(Exception e) {
logger.log(e.getMessage());
}
```
In the above, the delete function is all about error processing. It is easy to understand and then ignore. The deletePageAndAllReferences function is all about the processes of fully deleting a page. Error handling can be ignored. This provides a nice separation that makes the code easier to understand and modify.
> 在上例中,delete 函数只与错误处理有关。很容易理解然后就忽略掉。deletePageAndAllReference 函数只与完全删除一个 page 有关。错误处理可以忽略掉。有了这样美妙的区隔,代码就更易于理解和修改了。
### 3.9.2 Error Handling Is One Thing 错误处理就是一件事
Functions should do one thing. Error handing is one thing. Thus, a function that handles errors should do nothing else. This implies (as in the example above) that if the keyword try exists in a function, it should be the very first word in the function and that there should be nothing after the catch/finally blocks.
> 函数应该只做一件事。错误处理就是一件事。因此,处理错误的函数不该做其他事。这意味着(如上例所示)如果关键字 try 在某个函数中存在,它就该是这个函数的第一个单词,而且在 catch/finally 代码块后面也不该有其他内容。
### 3.9.3 The Error.java Dependency Magnet Error.java 依赖磁铁
Returning error codes usually implies that there is some class or enum in which all the error codes are defined.
> 返回错误码通常暗示某处有个类或是枚举,定义了所有错误码。
```java
public enum Error {
OK,
INVALID,
NO_SUCH,
LOCKED,
OUT_OF_RESOURCES,
WAITING_FOR_EVENT;
}
```
Classes like this are a dependency magnet; many other classes must import and use them. Thus, when the Error enum changes, all those other classes need to be recompiled and redeployed.11 This puts a negative pressure on the Error class. Programmers don’t want to add new errors because then they have to rebuild and redeploy everything. So they reuse old error codes instead of adding new ones.
> 这样的类就是一块依赖磁铁(dependency magnet);其他许多类都得导入和使用它。当 Error 枚举修改时,所有这些其他的类都需要重新编译和部署。[11]这对 Error 类造成了负面压力。程序员不愿增加新的错误代码,因为这样他们就得重新构建和部署所有东西。于是他们就复用旧的错误码,而不添加新的。
11. Those who felt that they could get away without recompiling and redeploying have been found—and dealt with.
When you use exceptions rather than error codes, then new exceptions are derivatives of the exception class. They can be added without forcing any recompilation or redeployment.12
> 使用异常替代错误码,新异常就可以从异常类派生出来,无需重新编译或重新部署[12]。
12. This is an example of the Open Closed Principle (OCP) [PPP02].
## 3.10 DON’T REPEAT YOURSELF13 别重复自己
13. The DRY principle. [PRAG].
Look back at Listing 3-1 carefully and you will notice that there is an algorithm that gets repeated four times, once for each of the SetUp, SuiteSetUp, TearDown, and SuiteTearDown cases. It’s not easy to spot this duplication because the four instances are intermixed with other code and aren’t uniformly duplicated. Still, the duplication is a problem because it bloats the code and will require four-fold modification should the algorithm ever have to change. It is also a four-fold opportunity for an error of omission.
> 回头仔细看看代码清单 3-1,你会注意到,有个算法在 SetUp、SuiteSetUp、TearDown 和 SuiteTearDown 中总共被重复了 4 次。识别重复不太容易,因为这 4 次重复与其他代码混在一起,而且也不完全一样。这样的重复还是会导致问题,因为代码因此而臃肿,且当算法改变时需要修改 4 处地方。而且也会增加 4 次放过错误的可能性。

This duplication was remedied by the include method in Listing 3-7. Read through that code again and notice how the readability of the whole module is enhanced by the reduction of that duplication.
> 使用代码清单 3-7 中的 include 方法修正了这些重复。再读一遍那段代码,你会注意到,整个模块的可读性因为重复的消除而得到了提升。
Duplication may be the root of all evil in software. Many principles and practices have been created for the purpose of controlling or eliminating it. Consider, for example, that all of Codd’s database normal forms serve to eliminate duplication in data. Consider also how object-oriented programming serves to concentrate code into base classes that would otherwise be redundant. Structured programming, Aspect Oriented Programming, Component Oriented Programming, are all, in part, strategies for eliminating duplication. It would appear that since the invention of the subroutine, innovations in software development have been an ongoing attempt to eliminate duplication from our source code.
> 重复可能是软件中一切邪恶的根源。许多原则与实践规则都是为控制与消除重复而创建。例如,全部考德(Codd)[14]数据库范式都是为消灭数据重复而服务。再想想看,面向对象编程是如何将代码集中到基类,从而避免了冗余。面向方面编程(Aspect Oriented Programming)、面向组件编程(Component Oriented Programming)多少也都是消除重复的一种策略。看来,自子程序发明以来,软件开发领域的所有创新都是在不断尝试从源代码中消灭重复。
## 3.11 STRUCTURED PROGRAMMING 结构化编程
Some programmers follow Edsger Dijkstra’s rules of structured programming.14 Dijkstra said that every function, and every block within a function, should have one entry and one exit. Following these rules means that there should only be one return statement in a function, no break or continue statements in a loop, and never, ever, any goto statements.
> 有些程序员遵循 Edsger Dijkstra 的结构化编程规则[15]。Dijkstra 认为,每个函数、函数中的每个代码块都应该有一个入口、一个出口。遵循这些规则,意味着在每个函数中只该有一个 return 语句,循环中不能有 break 或 continue 语句,而且永永远远不能有任何 goto 语句。
14. [SP72].
While we are sympathetic to the goals and disciplines of structured programming, those rules serve little benefit when functions are very small. It is only in larger functions that such rules provide significant benefit.
> 我们赞成结构化编程的目标和规范,但对于小函数,这些规则助益不大。只有在大函数中,这些规则才会有明显的好处。
So if you keep your functions small, then the occasional multiple return, break, or continue statement does no harm and can sometimes even be more expressive than the single-entry, single-exit rule. On the other hand, goto only makes sense in large functions, so it should be avoided.
> 所以,只要函数保持短小,偶尔出现的 return、break 或 continue 语句没有坏处,甚至还比单入单出原则更具有表达力。另外一方面,goto 只在大函数中才有道理,所以应该尽量避免使用。
## 3.12 HOW DO YOU WRITE FUNCTIONS LIKE THIS? 如何写出这样的函数
Writing software is like any other kind of writing. When you write a paper or an article, you get your thoughts down first, then you massage it until it reads well. The first draft might be clumsy and disorganized, so you wordsmith it and restructure it and refine it until it reads the way you want it to read.
> 写代码和写别的东西很像。在写论文或文章时,你先想什么就写什么,然后再打磨它。初稿也许粗陋无序,你就斟酌推敲,直至达到你心目中的样子。
When I write functions, they come out long and complicated. They have lots of indenting and nested loops. They have long argument lists. The names are arbitrary, and there is duplicated code. But I also have a suite of unit tests that cover every one of those clumsy lines of code.
> 我写函数时,一开始都冗长而复杂。有太多缩进和嵌套循环。有过长的参数列表。名称是随意取的,也会有重复的代码。不过我会配上一套单元测试,覆盖每行丑陋的代码。
So then I massage and refine that code, splitting out functions, changing names, eliminating duplication. I shrink the methods and reorder them. Sometimes I break out whole classes, all the while keeping the tests passing.
> 然后我打磨这些代码,分解函数、修改名称、消除重复。我缩短和重新安置方法。有时我还拆散类。同时保持测试通过。最后,遵循本章列出的规则,我组装好这些函数。
In the end, I wind up with functions that follow the rules I’ve laid down in this chapter. I don’t write them that way to start. I don’t think anyone could.
> 我并不从一开始就按照规则写函数。我想没人做得到。
## 3.13 CONCLUSION 小结
Every system is built from a domain-specific language designed by the programmers to describe that system. Functions are the verbs of that language, and classes are the nouns. This is not some throwback to the hideous old notion that the nouns and verbs in a requirements document are the first guess of the classes and functions of a system. Rather, this is a much older truth. The art of programming is, and has always been, the art of language design.
> 每个系统都是使用某种领域特定语言搭建,而这种语言是程序员设计来描述那个系统的。函数是语言的动词,类是名词。这并非是退回到那种认为需求文档中的名词和动词就是系统中类和函数的最初设想的可怕的旧观念。其实这是个历史更久的真理。编程艺术是且一直就是语言设计的艺术。
Master programmers think of systems as stories to be told rather than programs to be written. They use the facilities of their chosen programming language to construct a much richer and more expressive language that can be used to tell that story. Part of that domain-specific language is the hierarchy of functions that describe all the actions that take place within that system. In an artful act of recursion those actions are written to use the very domain-specific language they define to tell their own small part of the story.
> 大师级程序员把系统当作故事来讲,而不是当作程序来写。他们使用选定编程语言提供的工具构建一种更为丰富且更具表达力的语言,用来讲那个故事。那种领域特定语言的一个部分,就是描述在系统中发生的各种行为的函数层级。在一种狡猾的递归操作中,这些行为使用它们定义的与领域紧密相关的语言讲述自己那个小故事。
This chapter has been about the mechanics of writing functions well. If you follow the rules herein, your functions will be short, well named, and nicely organized. But never forget that your real goal is to tell the story of the system, and that the functions you write need to fit cleanly together into a clear and precise language to help you with that telling.
> 本章所讲述的是有关编写良好函数的机制。如果你遵循这些规则,函数就会短小,有个好名字,而且被很好地归置。不过永远别忘记,真正的目标在于讲述系统的故事,而你编写的函数必须干净利落地拼装到一起,形成一种精确而清晰的语言,帮助你讲故事。
## 3.14 SETUPTEARDOWNINCLUDER SetupTeardownIncluder 程序
Listing 3-7 SetupTeardownIncluder.java
> 代码清单 3-7 SetupTeardownIncluder.java
```java
package fitnesse.html;
import fitnesse.responders.run.SuiteResponder;
import fitnesse.wiki.*;
public class SetupTeardownIncluder {
private PageData pageData;
private boolean isSuite;
private WikiPage testPage;
private StringBuffer newPageContent;
private PageCrawler pageCrawler;
public static String render(PageData pageData) throws Exception {
return render(pageData, false);
}
public static String render(PageData pageData, boolean isSuite)
throws Exception {
return new SetupTeardownIncluder(pageData).render(isSuite);
}
private SetupTeardownIncluder(PageData pageData) {
this.pageData = pageData;
testPage = pageData.getWikiPage();
pageCrawler = testPage.getPageCrawler();
newPageContent = new StringBuffer();
}
private String render(boolean isSuite) throws Exception {
this.isSuite = isSuite;
if (isTestPage())
includeSetupAndTeardownPages();
return pageData.getHtml();
}
private boolean isTestPage() throws Exception {
return pageData.hasAttribute("Test");
}
private void includeSetupAndTeardownPages() throws Exception {
includeSetupPages();
includePageContent();
includeTeardownPages();
updatePageContent();
}
private void includeSetupPages() throws Exception {
if (isSuite)
includeSuiteSetupPage();
includeSetupPage();
}
private void includeSuiteSetupPage() throws Exception {
include(SuiteResponder.SUITE_SETUP_NAME, "-setup");
}
private void includeSetupPage() throws Exception {
include("SetUp", "-setup");
}
private void includePageContent() throws Exception {
newPageContent.append(pageData.getContent());
}
private void includeTeardownPages() throws Exception {
includeTeardownPage();
if (isSuite)
includeSuiteTeardownPage();
}
private void includeTeardownPage() throws Exception {
include("TearDown", "-teardown");
}
private void includeSuiteTeardownPage() throws Exception {
include(SuiteResponder.SUITE_TEARDOWN_NAME, "-teardown");
}
private void updatePageContent() throws Exception {
pageData.setContent(newPageContent.toString());
}
private void include(String pageName, String arg) throws Exception {
WikiPage inheritedPage = findInheritedPage(pageName);
if (inheritedPage != null) {
String pagePathName = getPathNameForPage(inheritedPage);
buildIncludeDirective(pagePathName, arg);
}
}
private WikiPage findInheritedPage(String pageName) throws Exception {
return PageCrawlerImpl.getInheritedPage(pageName, testPage);
}
private String getPathNameForPage(WikiPage page) throws Exception {
WikiPagePath pagePath = pageCrawler.getFullPath(page);
return PathParser.render(pagePath);
}
private void buildIncludeDirective(String pagePathName, String arg) {
newPageContent
.append("\n!include ")
.append(arg)
.append(" .")
.append(pagePathName)
.append("\n");
}
}
```
================================================
FILE: docs/ch4.md
================================================
# 第 4 章 Comments 注释

“Don’t comment bad code—rewrite it.”—Brian W. Kernighan and P. J. Plaugher1
> “别给糟糕的代码加注释——重新写吧。”——Brian W. Kernighan 与 P. J.
1. [KP78], p. 144.
Nothing can be quite so helpful as a well-placed comment. Nothing can clutter up a module more than frivolous dogmatic comments. Nothing can be quite so damaging as an old crufty comment that propagates lies and misinformation.
> 什么也比不上放置良好的注释来得有用。什么也不会比乱七八糟的注释更有本事搞乱一个模块。什么也不会比陈旧、提供错误信息的注释更有破坏性。
Comments are not like Schindler’s List. They are not “pure good.” Indeed, comments are, at best, a necessary evil. If our programming languages were expressive enough, or if we had the talent to subtly wield those languages to express our intent, we would not need comments very much—perhaps not at all.
> 注释并不像辛德勒的名单。它们并不“纯然地好”。实际上,注释最多也就是一种必须的恶。若编程语言足够有表达力,或者我们长于用这些语言来表达意图,就不那么需要注释——也许根本不需要。
The proper use of comments is to compensate for our failure to express ourself in code. Note that I used the word failure. I meant it. Comments are always failures. We must have them because we cannot always figure out how to express ourselves without them, but their use is not a cause for celebration.
> 注释的恰当用法是弥补我们在用代码表达意图时遭遇的失败。注意,我用了“失败”一词。我是说真的。注释总是一种失败。我们总无法找到不用注释就能表达自我的方法,所以总要有注释,这并不值得庆贺。
So when you find yourself in a position where you need to write a comment, think it through and see whether there isn’t some way to turn the tables and express yourself in code. Every time you express yourself in code, you should pat yourself on the back. Every time you write a comment, you should grimace and feel the failure of your ability of expression.
> 如果你发现自己需要写注释,再想想看是否有办法翻盘,用代码来表达。每次用代码表达,你都该夸奖一下自己。每次写注释,你都该做个鬼脸,感受自己在表达能力上的失败。
Why am I so down on comments? Because they lie. Not always, and not intentionally, but too often. The older a comment is, and the farther away it is from the code it describes, the more likely it is to be just plain wrong. The reason is simple. Programmers can’t realistically maintain them.
> 我为什么要极力贬低注释?因为注释会撒谎。也不是说总是如此或有意如此,但出现得实在太频繁。注释存在的时间越久,就离其所描述的代码越远,越来越变得全然错误。原因很简单。程序员不能坚持维护注释。
Code changes and evolves. Chunks of it move from here to there. Those chunks bifurcate and reproduce and come together again to form chimeras. Unfortunately the comments don’t always follow them—can’t always follow them. And all too often the comments get separated from the code they describe and become orphaned blurbs of ever-decreasing accuracy. For example, look what has happened to this comment and the line it was intended to describe:
> 代码在变动,在演化。从这里移到那里。彼此分离、重造又合到一处。很不幸,注释并不总是随之变动——不能总是跟着走。注释常常会与其所描述的代码分隔开来,孑然飘零,越来越不准确。例如,看看以下注释以及它本来要描述的代码行变成了什么样子:
```java
MockRequest request;
private final String HTTP_DATE_REGEXP =
“[SMTWF][a-z]{2}\\,\\s[0-9]{2}\\s[JFMASOND][a-z]{2}\\s”+
“[0-9]{4}\\s[0-9]{2}\\:[0-9]{2}\\:[0-9]{2}\\sGMT”;
private Response response;
private FitNesseContext context;
private FileResponder responder;
private Locale saveLocale;
// Example: ”Tue, 02 Apr 2003 22:18:49 GMT”
```
Other instance variables that were probably added later were interposed between the HTTP_DATE_REGEXP constant and it’s explanatory comment.
> 在 HTTP_DATE_REGEXP 常量及其注释之间,有可能插入其他实体变量。
It is possible to make the point that programmers should be disciplined enough to keep the comments in a high state of repair, relevance, and accuracy. I agree, they should. But I would rather that energy go toward making the code so clear and expressive that it does not need the comments in the first place.
> 程序员应当负责将注释保持在可维护、有关联、精确的高度。我同意这种说法。但我更主张把力气用在写清楚代码上,直接保证无须编写注释。
Inaccurate comments are far worse than no comments at all. They delude and mislead. They set expectations that will never be fulfilled. They lay down old rules that need not, or should not, be followed any longer.
> 不准确的注释要比没注释坏得多。它们满口胡言。它们预期的东西永不能实现。它们设定了无需也不应再遵循的旧规则。
Truth can only be found in one place: the code. Only the code can truly tell you what it does. It is the only source of truly accurate information. Therefore, though comments are sometimes necessary, we will expend significant energy to minimize them.
> 真实只在一处地方有:代码。只有代码能忠实地告诉你它做的事。那是唯一真正准确的信息来源。所以,尽管有时也需要注释,我们也该多花心思尽量减少注释量。
## 4.1 COMMENTS DO NOT MAKE UP FOR BAD CODE 注释不能美化糟糕的代码
One of the more common motivations for writing comments is bad code. We write a module and we know it is confusing and disorganized. We know it’s a mess. So we say to ourselves, “Ooh, I’d better comment that!” No! You’d better clean it!
> 写注释的常见动机之一是糟糕的代码的存在。我们编写一个模块,发现它令人困扰、乱七八糟。我们知道,它烂透了。我们告诉自己:“喔,最好写点注释!”不!最好是把代码弄干净!
Clear and expressive code with few comments is far superior to cluttered and complex code with lots of comments. Rather than spend your time writing the comments that explain the mess you’ve made, spend it cleaning that mess.
> 带有少量注释的整洁而有表达力的代码,要比带有大量注释的零碎而复杂的代码像样得多。与其花时间编写解释你搞出的糟糕的代码的注释,不如花时间清洁那堆糟糕的代码。
## 4.2 EXPLAIN YOURSELF IN CODE 用代码来阐述
There are certainly times when code makes a poor vehicle for explanation. Unfortunately, many programmers have taken this to mean that code is seldom, if ever, a good means for explanation. This is patently false. Which would you rather see? This:
> 有时,代码本身不足以解释其行为。不幸的是,许多程序员据此以为代码很少——如果有的话——能做好解释工作。这种观点纯属错误。
```java
// Check to see if the employee is eligible for full benefits
if ((employee.flags & HOURLY_FLAG) &&
(employee.age > 65))
```
Or this?
> 你愿意看到这个:
```java
if (employee.isEligibleForFullBenefits())
```
It takes only a few seconds of thought to explain most of your intent in code. In many cases it’s simply a matter of creating a function that says the same thing as the comment you want to write.
> 只要想上那么几秒钟,就能用代码解释你大部分的意图。很多时候,简单到只需要创建一个描述与注释所言同一事物的函数即可。
## 4.3 GOOD COMMENTS 好注释
Some comments are necessary or beneficial. We’ll look at a few that I consider worthy of the bits they consume. Keep in mind, however, that the only truly good comment is the comment you found a way not to write.
> 有些注释是必须的,也是有利的。来看看一些我认为值得写的注释。不过要记住,唯一真正好的注释是你想办法不去写的注释。
### 4.3.1 Legal Comments 法律信息
Sometimes our corporate coding standards force us to write certain comments for legal reasons. For example, copyright and authorship statements are necessary and reasonable things to put into a comment at the start of each source file.
> 有时,公司代码规范要求编写与法律有关的注释。例如,版权及著作权声明就是必须和有理由在每个源文件开头注释处放置的内容。
Here, for example, is the standard comment header that we put at the beginning of every source file in FitNesse. I am happy to say that our IDE hides this comment from acting as clutter by automatically collapsing it.
> 下例是我们在 FitNesse 项目每个源文件开头放置的标准注释。我可以很开心地说,IDE 自动卷起这些注释,这样就不会显得凌乱了。
```java
// Copyright (C) 2003,2004,2005 by Object Mentor, Inc. All rights reserved.
// Released under the terms of the GNU General Public License version 2 or later.
```
Comments like this should not be contracts or legal tomes. Where possible, refer to a standard license or other external document rather than putting all the terms and conditions into the comment.
> 这类注释不应是合同或法典。只要有可能,就指向一份标准许可或其他外部文档,而不要把所有条款放到注释中。
### 4.3.2 Informative Comments 提供信息的注释
It is sometimes useful to provide basic information with a comment. For example, consider this comment that explains the return value of an abstract method:
> 有时,用注释来提供基本信息也有其用处。例如,以下注释解释了某个抽象方法的返回值:
```java
// Returns an instance of the Responder being tested.
protected abstract Responder responderInstance();
```
A comment like this can sometimes be useful, but it is better to use the name of the function to convey the information where possible. For example, in this case the comment could be made redundant by renaming the function: responderBeingTested.
> 这类注释有时管用,但更好的方式是尽量利用函数名称传达信息。比如,在本例中,只要把函数重新命名为 responderBeingTested,注释就是多余的了。
Here’s a case that’s a bit better:
> 下例稍好一些:
```java
// format matched kk:mm:ss EEE, MMM dd, yyyy
Pattern timeMatcher = Pattern.compile(
“\\d*:\\d*:\\d* \\w*, \\w* \\d*, \\d*”);
```
In this case the comment lets us know that the regular expression is intended to match a time and date that were formatted with the SimpleDateFormat.format function using the specified format string. Still, it might have been better, and clearer, if this code had been moved to a special class that converted the formats of dates and times. Then the comment would likely have been superfluous.
> SimpleDateFormat.format 函数利用特定格式字符串格式化的时间和日期。同样,如果把这段代码移到某个转换日期和时间格式的类中,就会更好、更清晰,而注释也就变得多此一举了。
### 4.3.3 Explanation of Intent 对意图的解释
Sometimes a comment goes beyond just useful information about the implementation and provides the intent behind a decision. In the following case we see an interesting decision documented by a comment. When comparing two objects, the author decided that he wanted to sort objects of his class higher than objects of any other.
> 有时,注释不仅提供了有关实现的有用信息,而且还提供了某个决定后面的意图。在下例中,我们看到注释反映出来的一个有趣决定。在对比两个对象时,作者决定将他的类放置在比其他东西更高的位置。
```java
public int compareTo(Object o)
{
if(o instanceof WikiPagePath)
{
WikiPagePath p = (WikiPagePath) o;
String compressedName = StringUtil.join(names, “”);
String compressedArgumentName = StringUtil.join(p.names, “”);
return compressedName.compareTo(compressedArgumentName);
}
return 1; // we are greater because we are the right type.
}
```
Here’s an even better example. You might not agree with the programmer’s solution to the problem, but at least you know what he was trying to do.
> 下面的例子甚至更好。你也许不同意程序员给这个问题提供的解决方案,但至少你知道他想干什么。
```java
public void testConcurrentAddWidgets() throws Exception {
WidgetBuilder widgetBuilder =
new WidgetBuilder(new Class[]{BoldWidget.class});
String text = ”’’’bold text’’’”;
ParentWidget parent =
new BoldWidget(new MockWidgetRoot(), ”’’’bold text’’’”);
AtomicBoolean failFlag = new AtomicBoolean();
failFlag.set(false);
//This is our best attempt to get a race condition
//by creating large number of threads.
for (int i = 0; i < 25000; i++) {
WidgetBuilderThread widgetBuilderThread =
new WidgetBuilderThread(widgetBuilder, text, parent, failFlag);
Thread thread = new Thread(widgetBuilderThread);
thread.start();
}
assertEquals(false, failFlag.get());
}
```
### 4.3.4 Clarification 阐释
Sometimes it is just helpful to translate the meaning of some obscure argument or return value into something that’s readable. In general it is better to find a way to make that argument or return value clear in its own right; but when its part of the standard library, or in code that you cannot alter, then a helpful clarifying comment can be useful.
> 有时,注释把某些晦涩难明的参数或返回值的意义翻译为某种可读形式,也会是有用的。通常,更好的方法是尽量让参数或返回值自身就足够清楚;但如果参数或返回值是某个标准库的一部分,或是你不能修改的代码,帮助阐释其含义的代码就会有用。
```java
public void testCompareTo() throws Exception
{
WikiPagePath a = PathParser.parse("PageA");
WikiPagePath ab = PathParser.parse("PageA.PageB");
WikiPagePath b = PathParser.parse("PageB");
WikiPagePath aa = PathParser.parse("PageA.PageA");
WikiPagePath bb = PathParser.parse("PageB.PageB");
WikiPagePath ba = PathParser.parse("PageB.PageA");
assertTrue(a.compareTo(a) == 0); // a == a
assertTrue(a.compareTo(b) != 0); // a != b
assertTrue(ab.compareTo(ab) == 0); // ab == ab
assertTrue(a.compareTo(b) == -1); // a < b
assertTrue(aa.compareTo(ab) == -1); // aa < ab
assertTrue(ba.compareTo(bb) == -1); // ba < bb
assertTrue(b.compareTo(a) == 1); // b > a
assertTrue(ab.compareTo(aa) == 1); // ab > aa
assertTrue(bb.compareTo(ba) == 1); // bb > ba
}
```
There is a substantial risk, of course, that a clarifying comment is incorrect. Go through the previous example and see how difficult it is to verify that they are correct. This explains both why the clarification is necessary and why it’s risky. So before writing comments like this, take care that there is no better way, and then take even more care that they are accurate.
> 当然,这也会冒阐释性注释本身就不正确的风险。回头看看上例,你会发现想要确认注释的正确性有多难。这一方面说明了阐释有多必要,另外也说明了它有风险。所以,在写这类注释之前,考虑一下是否还有更好的办法,然后再加倍小心地确认注释正确性。
### 4.3.5 Warning of Consequences 警示
Sometimes it is useful to warn other programmers about certain consequences. For example, here is a comment that explains why a particular test case is turned off:
> 有时,用于警告其他程序员会出现某种后果的注释也是有用的。例如,下面的注释解释了为什么要关闭某个特定的测试用例:

```java
// Don't run unless you
// have some time to kill.
public void _testWithReallyBigFile()
{
writeLinesToFile(10000000);
response.setBody(testFile);
response.readyToSend(this);
String responseString = output.toString();
assertSubString("Content-Length: 1000000000", responseString);
assertTrue(bytesSent > 1000000000);
}
```
Nowadays, of course, we’d turn off the test case by using the @Ignore attribute with an appropriate explanatory string. @Ignore(”Takes too long to run”). But back in the days before JUnit 4, putting an underscore in front of the method name was a common convention. The comment, while flippant, makes the point pretty well.
> 当然,如今我们多数会利用附上恰当解释性字符串的@Ignore 属性来关闭测试用例。比如@Ignore("Takes too long to run[2]")。但在 JUnit4 之前的日子里,惯常的做法是在方法名前面加上下划线。如果注释足够有说服力,就会很有用了。
Here’s another, more poignant example:
> 这里有个更麻烦的例子:
```java
public static
SimpleDateFormat makeStandardHttpDateFormat()
{
//SimpleDateFormat is not thread safe,
//so we need to create each instance independently.
SimpleDateFormat df = new SimpleDateFormat(”EEE, dd MMM yyyy HH:mm:ss z”);
df.setTimeZone(TimeZone.getTimeZone(”GMT”));
return df;
}
```
You might complain that there are better ways to solve this problem. I might agree with you. But the comment, as given here, is perfectly reasonable. It will prevent some overly eager programmer from using a static initializer in the name of efficiency.
> 你也许会抱怨说,还会有更好的解决方法。我大概会同意。不过上面的注释绝对有道理存在,它能阻止某位急切的程序员以效率之名使用静态初始器。
### 4.3.6 TODO Comments TODO 注释
It is sometimes reasonable to leave “To do” notes in the form of //TODO comments. In the following case, the TODO comment explains why the function has a degenerate implementation and what that function’s future should be.
> 有时,有理由用//TODO 形式在源代码中放置要做的工作列表。在下例中,TODO 注释解释了为什么该函数的实现部分无所作为,将来应该是怎样。
```java
//TODO-MdM these are not needed
// We expect this to go away when we do the checkout model
protected VersionInfo makeVersion() throws Exception
{
return null;
}
```
TODOs are jobs that the programmer thinks should be done, but for some reason can’t do at the moment. It might be a reminder to delete a deprecated feature or a plea for someone else to look at a problem. It might be a request for someone else to think of a better name or a reminder to make a change that is dependent on a planned event. Whatever else a TODO might be, it is not an excuse to leave bad code in the system.
> TODO 是一种程序员认为应该做,但由于某些原因目前还没做的工作。它可能是要提醒删除某个不必要的特性,或者要求他人注意某个问题。它可能是恳请别人取个好名字,或者提示对依赖于某个计划事件的修改。无论 TODO 的目的如何,它都不是在系统中留下糟糕的代码的借口。
Nowadays, most good IDEs provide special gestures and features to locate all the TODO comments, so it’s not likely that they will get lost. Still, you don’t want your code to be littered with TODOs. So scan through them regularly and eliminate the ones you can.
> 如今,大多数好 IDE 都提供了特别的手段来定位所有 TODO 注释,这些注释看来丢不了。你不会愿意代码因为 TODO 的存在而变成一堆垃圾,所以要定期查看,删除不再需要的。
### 4.3.7 Amplification 放大
A comment may be used to amplify the importance of something that may otherwise seem inconsequential.
> 注释可以用来放大某种看来不合理之物的重要性。
```java
String listItemContent = match.group(3).trim();
// the trim is real important. It removes the starting
// spaces that could cause the item to be recognized
// as another list.
new ListItemWidget(this, listItemContent, this.level + 1);
return buildList(text.substring(match.end()));
```
### 4.3.8 Javadocs in Public APIs 公共 API 中的 Javadoc
There is nothing quite so helpful and satisfying as a well-described public API. The java-docs for the standard Java library are a case in point. It would be difficult, at best, to write Java programs without them.
> 没有什么比被良好描述的公共 API 更有用和令人满意的了。标准 Java 库中的 Javadoc 就是一例。没有它们,写 Java 程序就会变得很难。
If you are writing a public API, then you should certainly write good javadocs for it. But keep in mind the rest of the advice in this chapter. Javadocs can be just as misleading, nonlocal, and dishonest as any other kind of comment.
> 如果你在编写公共 API,就该为它编写良好的 Javadoc。不过要记住本章中的其他建议。就像其他注释一样,Javadoc 也可能误导、不适用或者提供错误信息。
## 4.4 BAD COMMENTS 坏注释
Most comments fall into this category. Usually they are crutches or excuses for poor code or justifications for insufficient decisions, amounting to little more than the programmer talking to himself.
> 大多数注释都属此类。通常,坏注释都是糟糕的代码的支撑或借口,或者对错误决策的修正,基本上等于程序员自说自话。
### 4.4.1 Mumbling 喃喃自语
Plopping in a comment just because you feel you should or because the process requires it, is a hack. If you decide to write a comment, then spend the time necessary to make sure it is the best comment you can write.
> 如果只是因为你觉得应该或者因为过程需要就添加注释,那就是无谓之举。如果你决定写注释,就要花必要的时间确保写出最好的注释。
Here, for example, is a case I found in FitNesse, where a comment might indeed have been useful. But the author was in a hurry or just not paying much attention. His mumbling left behind an enigma:
> 例如,我在 FitNesse 中找到的这个例子,例中的注释大概确实有用。不过,作者太着急,或者没太花心思。他的喃喃自语变成了一个谜团。
```java
public void loadProperties()
{
try
{
String propertiesPath = propertiesLocation +
”/” + PROPERTIES_FILE;
FileInputStream propertiesStream = new
FileInputStream(propertiesPath);
loadedProperties.load(propertiesStream);
}
catch(IOException e)
{
// No properties files means all defaults are loaded
}
}
```
What does that comment in the catch block mean? Clearly it meant something to the author, but the meaning does not come through all that well. Apparently, if we get an IOException, it means that there was no properties file; and in that case all the defaults are loaded. But who loads all the defaults? Were they loaded before the call to loadProperties.load? Or did loadProperties.load catch the exception, load the defaults, and then pass the exception on for us to ignore? Or did loadProperties.load load all the defaults before attempting to load the file? Was the author trying to comfort himself about the fact that he was leaving the catch block empty? Or—and this is the scary possibility—was the author trying to tell himself to come back here later and write the code that would load the defaults?
> catch 代码块中的注释是什么意思呢?显然对于作者有其意义,不过并没有好到足够的程度。很明显,如果出现 IOException,就表示没有属性文件;在那种情况下,载入默认设置。但谁来装载默认设置呢?会在对 loadProperties.load 之前装载吗?抑或 loadProperties.load 捕获异常、装载默认设置、再向上传递异常?再或 loadProperties.load 在尝试载入文件前就装载所有默认设置?要么作者只是在安慰自己别在意 catch 代码块的留空?
Our only recourse is to examine the code in other parts of the system to find out what’s going on. Any comment that forces you to look in another module for the meaning of that comment has failed to communicate to you and is not worth the bits it consumes.
> 或者——这种可能最可怕——作者是想告诉自己,将来再回头写装载默认设置的代码?我们唯有检视系统其他部分的代码,弄清事情原委。任何迫使读者查看其他模块的注释,都没能与读者沟通好,不值所费。
### 4.4.2 Redundant Comments 多余的注释
Listing 4-1 shows a simple function with a header comment that is completely redundant. The comment probably takes longer to read than the code itself.
> 代码清单 4-1 展示的简单函数,其头部位置的注释全属多余。读这段注释花的时间没准比读代码花的时间还要长。
Listing 4-1 waitForClose
> 代码清单 4-1 waitForClose
```java
// Utility method that returns when this.closed
is true. Throws an exception
// if the timeout is reached.
public synchronized void waitForClose(final long timeoutMillis)
throws Exception
{
if(!closed)
{
wait(timeoutMillis);
if(!closed)
throw new Exception("MockResponseSender could not be closed");
}
}
```
What purpose does this comment serve? It’s certainly not more informative than the code. It does not justify the code, or provide intent or rationale. It is not easier to read than the code. Indeed, it is less precise than the code and entices the reader to accept that lack of precision in lieu of true understanding. It is rather like a gladhanding used-car salesman assuring you that you don’t need to look under the hood.
> 这段注释起了什么作用?它并不能比代码本身提供更多的信息。它没有证明代码的意义,也没有给出代码的意图或逻辑。读它并不比读代码更容易。事实上,它不如代码精确,误导读者接受不精确的信息,而不是正确地理解代码。它就像个自来熟的二手车贩子,满口保证你不用打开发动机盖查验。来看看代码清单 4-2 中摘自 Tomcat 项目的无用而多余的 Javadoc 吧。
Now consider the legion of useless and redundant javadocs in Listing 4-2 taken from Tomcat. These comments serve only to clutter and obscure the code. They serve no documentary purpose at all. To make matters worse, I only showed you the first few. There are many more in this module.
> 这些注释只是一味将代码搞得含糊不明。完全没有文档上的价值。下面只列出了靠前面的一些代码,后续模块中还有许多类似情况。
Listing 4-2 ContainerBase.java (Tomcat)
> 代码清单 4-2 ContainerBase.java (Tomcat)
```java
public abstract class ContainerBase
implements Container, Lifecycle, Pipeline,
MBeanRegistration, Serializable {
/**
* The processor delay for this component.
*/
protected int backgroundProcessorDelay = -1;
/**
* The lifecycle event support for this component.
*/
protected LifecycleSupport lifecycle =
new LifecycleSupport(this);
/**
* The container event listeners for this Container.
*/
protected ArrayList listeners = new ArrayList();
/**
* The Loader implementation with which this Container is
* associated.
*/
protected Loader loader = null;
/**
* The Logger implementation with which this Container is
* associated.
*/
protected Log logger = null;
/**
* Associated logger name.
*/
protected String logName = null;
/**
* The Manager implementation with which this Container is
* associated.
*/
protected Manager manager = null;
/**
* The cluster with which this Container is associated.
*/
protected Cluster cluster = null;
/**
* The human-readable name of this Container.
*/
protected String name = null;
/**
* The parent Container to which this Container is a child.
*/
protected Container parent = null;
/**
* The parent class loader to be configured when we install a
* Loader.
*/
protected ClassLoader parentClassLoader = null;
/**
* The Pipeline object with which this Container is
* associated.
*/
protected Pipeline pipeline = new StandardPipeline(this);
/**
* The Realm with which this Container is associated.
*/
protected Realm realm = null;
/**
* The resources DirContext object with which this Container
* is associated.
*/
protected DirContext resources = null;
```
### 4.4.3 Misleading Comments 误导性注释
Sometimes, with all the best intentions, a programmer makes a statement in his comments that isn’t precise enough to be accurate. Consider for another moment the badly redundant but also subtly misleading comment we saw in Listing 4-1.
> 有时,尽管初衷可嘉,程序员还是会写出不够精确的注释。想想看代码清单 4-1 中那多余而又有误导嫌疑的注释吧。
Did you discover how the comment was misleading? The method does not return when this.closed becomes true. It returns if this.closed is true; otherwise, it waits for a blind time-out and then throws an exception if this.closed is still not true.
> 你有没有发现那样的注释是如何误导读者的?在 this.closed 变为 true 的时候,方法并没有返回。方法只在判断到 this.closed 为 true 的时候返回,否则,就只是等待遥遥无期的超时,然后如果判断 this.closed 还是非 true,就抛出一个异常。
This subtle bit of misinformation, couched in a comment that is harder to read than the body of the code, could cause another programmer to blithely call this function in the expectation that it will return as soon as this.closed becomes true. That poor programmer would then find himself in a debugging session trying to figure out why his code executed so slowly.
> 这一细微的误导信息,放在比代码本身更难阅读的注释里面,有可能导致其他程序员快活地调用这个函数,并期望在 this.closed 变为 true 时立即返回。那位可怜的程序员将会发现自己陷于调试困境之中,拼命想找出代码执行得如此之慢的原因。
### 4.4.4 Mandated Comments 循规式注释
It is just plain silly to have a rule that says that every function must have a javadoc, or every variable must have a comment. Comments like this just clutter up the code, propagate lies, and lend to general confusion and disorganization.
> 所谓每个函数都要有 Javadoc 或每个变量都要有注释的规矩全然是愚蠢可笑的。这类注释徒然让代码变得散乱,满口胡言,令人迷惑不解。
For example, required javadocs for every function lead to abominations such as Listing 4-3. This clutter adds nothing and serves only to obfuscate the code and create the potential for lies and misdirection.
> 例如,要求每个函数都要有 Javadoc,就会得到类似代码清单 4-3 那样面目可憎的代码。这类废话只会搞乱代码,有可能误导读者。
Listing 4-3
> 代码清单 4-3
```java
/**
*
* @param title The title of the CD
* @param author The author of the CD
* @param tracks The number of tracks on the CD
* @param durationInMinutes The duration of the CD in minutes
*/
public void addCD(String title, String author,
int tracks, int durationInMinutes) {
CD cd = new CD();
cd.title = title;
cd.author = author;
cd.tracks = tracks;
cd.duration = duration;
cdList.add(cd);
}
```
### 4.4.5 Journal Comments 日志式注释
Sometimes people add a comment to the start of a module every time they edit it. These comments accumulate as a kind of journal, or log, of every change that has ever been made. I have seen some modules with dozens of pages of these run-on journal entries.
> 有人会在每次编辑代码时,在模块开始处添加一条注释。这类注释就像是一种记录每次修改的日志。我见过满篇尽是这类日志的代码模块。
```java
* Changes (from 11-Oct-2001)
* --------------------------
* 11-Oct-2001 : Re-organised the class and moved it to new package
* com.jrefinery.date (DG);
* 05-Nov-2001 : Added a getDescription() method, and eliminated NotableDate
* class (DG);
* 12-Nov-2001 : IBD requires setDescription() method, now that NotableDate
* class is gone (DG); Changed getPreviousDayOfWeek(),
* getFollowingDayOfWeek() and getNearestDayOfWeek() to correct
* bugs (DG);
* 05-Dec-2001 : Fixed bug in SpreadsheetDate class (DG);
* 29-May-2002 : Moved the month constants into a separate interface
* (MonthConstants) (DG);
* 27-Aug-2002 : Fixed bug in addMonths() method, thanks to N???levka Petr (DG);
* 03-Oct-2002 : Fixed errors reported by Checkstyle (DG);
* 13-Mar-2003 : Implemented Serializable (DG);
* 29-May-2003 : Fixed bug in addMonths method (DG);
* 04-Sep-2003 : Implemented Comparable. Updated the isInRange javadocs (DG);
* 05-Jan-2005 : Fixed bug in addYears() method (1096282) (DG);
```
Long ago there was a good reason to create and maintain these log entries at the start of every module. We didn’t have source code control systems that did it for us. Nowadays, however, these long journals are just more clutter to obfuscate the module. They should be completely removed.
> 很久以前,在模块开始处创建并维护这些记录还算有道理。那时,我们还没有源代码控制系统可用。如今,这种冗长的记录只会让模块变得凌乱不堪,应当全部删除。
### 4.4.6 Noise Comments 废话注释
Sometimes you see comments that are nothing but noise. They restate the obvious and provide no new information.
> 有时,你会看到纯然是废话的注释。它们对于显然之事喋喋不休,毫无新意。
```java
/**
* Default constructor.
*/
protected AnnualDateRule() {
}
```
No, really? Or how about this:
> 对吧?再看看这个:
```java
/** The day of the month. */
private int dayOfMonth;
```
And then there’s this paragon of redundancy:
> 这样的废话模范:
```java
/**
* Returns the day of the month.
*
* @return the day of the month.
*/
public int getDayOfMonth() {
return dayOfMonth;
}
```
These comments are so noisy that we learn to ignore them. As we read through code, our eyes simply skip over them. Eventually the comments begin to lie as the code around them changes.
> 这类注释废话连篇,我们都学会了视而不见。读代码时,眼光不会停留在它们上面。最终,当代码修改之后,这类注释就变作了谎言一堆。
The first comment in Listing 4-4 seems appropriate.2 It explains why the catch block is being ignored. But the second comment is pure noise. Apparently the programmer was just so frustrated with writing try/catch blocks in this function that he needed to vent.
> 代码清单 4-4 中的第一条注释貌似还行[3]。它解释了 catch 代码块为何被忽略。不过第二条注释就纯是废话了。显然,该程序员沮丧于编写函数中那些 try/catch 代码块。
2. The current trend for IDEs to check spelling in comments will be a balm for those of us who read a lot of code.
Listing 4-4 startSending
> 代码清单 4-4 startSending
```java
private void startSending()
{
try
{
doSending();
}
catch(SocketException e)
{
// normal. someone stopped the request.
}
catch(Exception e)
{
try
{
response.add(ErrorResponder.makeExceptionString(e));
response.closeAll();
}
catch(Exception e1)
{
//Give me a break!
}
}
}
```
Rather than venting in a worthless and noisy comment, the programmer should have recognized that his frustration could be resolved by improving the structure of his code. He should have redirected his energy to extracting that last try/catch block into a separate function, as shown in Listing 4-5.
> 与其纠缠于毫无价值的废话注释,程序员应该意识到,他的挫败感可以由改进代码结构而消除。他应该把力气花在将最末一个 try/catch 代码块拆解到单独的函数中,如代码清单 4-5 所示。
Listing 4-5 startSending (refactored)
> 代码清单 4-5 startSending(重构之后)
```java
private void startSending()
{
try
{
doSending();
}
catch(SocketException e)
{
// normal. someone stopped the request.
}
catch(Exception e)
{
addExceptionAndCloseResponse(e);
}
}
private void addExceptionAndCloseResponse(Exception e)
{
try
{
response.add(ErrorResponder.makeExceptionString(e));
response.closeAll();
}
catch(Exception e1)
{
}
}
```
Replace the temptation to create noise with the determination to clean your code. You’ll find it makes you a better and happier programmer.
> 用整理代码的决心替代创造废话的冲动吧。你会发现自己成为更优秀、更快乐的程序员。
### 4.4.7 Scary Noise 可怕的废话
Javadocs can also be noisy. What purpose do the following Javadocs (from a well-known open-source library) serve? Answer: nothing. They are just redundant noisy comments written out of some misplaced desire to provide documentation.
> Javadoc 也可能是废话。下列 Javadoc(来自某知名开源库)的目的是什么?答案:无。它们只是源自某种提供文档的不当愿望的废话注释。
```java
/** The name. */
private String name;
/** The version. */
private String version;
/** The licenceName. */
private String licenceName;
/** The version. */
private String info;
```
Read these comments again more carefully. Do you see the cut-paste error? If authors aren’t paying attention when comments are written (or pasted), why should readers be expected to profit from them?
> 再仔细读读这些注释。你是否发现了剪切-粘贴错误?如果作者在写(或粘贴)注释时都没花心思,怎么能指望读者从中获益呢?
### 4.4.8 Don’t Use a Comment When You Can Use a Function or a Variable 能用函数或变量时就别用注释
Consider the following stretch of code:
> 看看以下代码概要:
```java
// does the module from the global list depend on the
// subsystem we are part of?
if (smodule.getDependSubsystems().contains(subSysMod.getSubSystem()))
```
This could be rephrased without the comment as
> 可以改成以下没有注释的版本:
```java
ArrayList moduleDependees = smodule.getDependSubsystems();
String ourSubSystem = subSysMod.getSubSystem();
if (moduleDependees.contains(ourSubSystem))
```
The author of the original code may have written the comment first (unlikely) and then written the code to fulfill the comment. However, the author should then have refactored the code, as I did, so that the comment could be removed.
> 代码原作者可能(不太像)是先写注释再编写代码。不过,作者应该重构代码,如我所做的那样,从而删掉注释。
### 4.4.9 Position Markers 位置标记
Sometimes programmers like to mark a particular position in a source file. For example, I recently found this in a program I was looking through:
> 有时,程序员喜欢在源代码中标记某个特别位置。例如,最近我在程序中看到这样一行:
```java
// Actions //////////////////////////////////
```
There are rare times when it makes sense to gather certain functions together beneath a banner like this. But in general they are clutter that should be eliminated—especially the noisy train of slashes at the end.
> 把特定函数趸放在这种标记栏下面,多数时候实属无理。鸡零狗碎,理当删除——特别是尾部那一长串无用的斜杠。
Think of it this way. A banner is startling and obvious if you don’t see banners very often. So use them very sparingly, and only when the benefit is significant. If you overuse banners, they’ll fall into the background noise and be ignored.
> 这么说吧。如果标记栏不多,就会显而易见。所以,尽量少用标记栏,只在特别有价值的时候用。如果滥用标记栏,就会沉没在背景噪音中,被忽略掉。
### 4.4.10 Closing Brace Comments 括号后面的注释
Sometimes programmers will put special comments on closing braces, as in Listing 4-6. Although this might make sense for long functions with deeply nested structures, it serves only to clutter the kind of small and encapsulated functions that we prefer. So if you find yourself wanting to mark your closing braces, try to shorten your functions instead.
> 有时,程序员会在括号后面放置特殊的注释,如代码清单 4-6 所示。尽管这对于含有深度嵌套结构的长函数可能有意义,但只会给我们更愿意编写的短小、封装的函数带来混乱。如果你发现自己想标记右括号,其实应该做的是缩短函数。
Listing 4-6 wc.java
> 代码清单 4-6 wc.java
```java
public class wc {
public static void main(String[] args) {
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
String line;
int lineCount = 0;
int charCount = 0;
int wordCount = 0;
try {
while ((line = in.readLine()) != null) {
lineCount++;
charCount += line.length();
String words[] = line.split("\\W");
wordCount += words.length;
} //while
System.out.println("wordCount = " + wordCount);
System.out.println("lineCount = " + lineCount);
System.out.println("charCount = " + charCount);
} // try
catch (IOException e) {
System.err.println("Error:" + e.getMessage());
} //catch
} //main
}
```
### 4.4.11 Attributions and Bylines 归属与署名
```java
/* Added by Rick */
```
Source code control systems are very good at remembering who added what, when. There is no need to pollute the code with little bylines. You might think that such comments would be useful in order to help others know who to talk to about the code. But the reality is that they tend to stay around for years and years, getting less and less accurate and relevant.
> 源代码控制系统非常善于记住是谁在何时添加了什么。没必要用那些小小的签名搞脏代码。你也许会认为,这种注释大概有助于他人了解应该和谁讨论这段代码。不过,事实却是注释在那儿放了一年又一年,越来越不准确,越来越和原作者没关系。
Again, the source code control system is a better place for this kind of information.
> 重申一下,源代码控制系统是这类信息最好的归属地。
### 4.4.12 Commented-Out Code 注释掉的代码
Few practices are as odious as commenting-out code. Don’t do this!
> 直接把代码注释掉是讨厌的做法。别这么干!
```java
InputStreamResponse response = new InputStreamResponse();
response.setBody(formatter.getResultStream(), formatter.getByteCount());
// InputStream resultsStream = formatter.getResultStream();
// StreamReader reader = new StreamReader(resultsStream);
// response.setContent(reader.read(formatter.getByteCount()));
```
Others who see that commented-out code won’t have the courage to delete it. They’ll think it is there for a reason and is too important to delete. So commented-out code gathers like dregs at the bottom of a bad bottle of wine.
> 其他人不敢删除注释掉的代码。他们会想,代码依然放在那儿,一定有其原因,而且这段代码很重要,不能删除。注释掉的代码堆积在一起,就像破酒瓶底的渣滓一般。
Consider this from apache commons:
> 看看以下来自 Apache 公共库的代码:
```java
this.bytePos = writeBytes(pngIdBytes, 0);
//hdrPos = bytePos;
writeHeader();
writeResolution();
//dataPos = bytePos;
if (writeImageData()) {
writeEnd();
this.pngBytes = resizeByteArray(this.pngBytes, this.maxPos);
}
else {
this.pngBytes = null;
}
return this.pngBytes;
```
Why are those two lines of code commented? Are they important? Were they left as reminders for some imminent change? Or are they just cruft that someone commented-out years ago and has simply not bothered to clean up.
> 这两行代码为什么要注释掉?它们重要吗?它们搁在那儿,是为了给未来的修改做提示吗?或者,只是某人在多年以前注释掉、懒得清理的过时玩意?
There was a time, back in the sixties, when commenting-out code might have been useful. But we’ve had good source code control systems for a very long time now. Those systems will remember the code for us. We don’t have to comment it out any more. Just delete the code. We won’t lose it. Promise.
> 20 世纪 60 年代,曾经有那么一段时间,注释掉的代码可能有用。但我们已经拥有优良的源代码控制系统如此之久,这些系统可以为我们记住不要的代码。我们无需再用注释来标记,删掉即可,它们丢不了。我担保。
### 4.4.13 HTML Comments HTML 注释
HTML in source code comments is an abomination, as you can tell by reading the code below. It makes the comments hard to read in the one place where they should be easy to read—the editor/IDE. If comments are going to be extracted by some tool (like Javadoc) to appear in a Web page, then it should be the responsibility of that tool, and not the programmer, to adorn the comments with appropriate HTML.
> 源代码注释中的 HTML 标记是一种厌物,如你在下面代码中所见。编辑器/IDE 中的代码本来易于阅读,却因为 HTML 注释的存在而变得难以卒读。如果注释将由某种工具(例如 Javadoc)抽取出来,呈现到网页,那么该是工具而非程序员来负责给注释加上合适的 HTML 标签。
```java
/**
* Task to run fit tests.
* This task runs fitnesse tests and publishes the results.
*
*
* Usage:
* <taskdef name="execute-fitnesse-tests"
* classname="fitnesse.ant.ExecuteFitnesseTestsTask"
* classpathref="classpath" />
* OR
* <taskdef classpathref="classpath"
* resource="tasks.properties" />
*
* <execute-fitnesse-tests
* suitepage="FitNesse.SuiteAcceptanceTests"
* fitnesseport="8082"
* resultsdir="${results.dir}"
* resultshtmlpage="fit-results.html"
* classpathref="classpath" />
*
*/
```
### 4.4.14 Nonlocal Information 非本地信息
If you must write a comment, then make sure it describes the code it appears near. Don’t offer systemwide information in the context of a local comment. Consider, for example, the javadoc comment below. Aside from the fact that it is horribly redundant, it also offers information about the default port. And yet the function has absolutely no control over what that default is. The comment is not describing the function, but some other, far distant part of the system. Of course there is no guarantee that this comment will be changed when the code containing the default is changed.
> 假如你一定要写注释,请确保它描述了离它最近的代码。别在本地注释的上下文环境中给出系统级的信息。以下面的 Javadoc 注释为例,除了那可怕的冗余之外,它还给出了有关默认端口的信息。不过该函数完全没控制到那个所谓默认值。这个注释并未描述该函数,而是在描述系统中远在他方的其他函数。当然,也无法担保在包含那个默认值的代码修改之后,这里的注释也会跟着修改。
```java
/**
* Port on which fitnesse would run. Defaults to 8082.
*
* @param fitnessePort
*/
public void setFitnessePort(int fitnessePort)
{
this.fitnessePort = fitnessePort;
}
```
### 4.4.15 Too Much Information 信息过多
Don’t put interesting historical discussions or irrelevant descriptions of details into your comments. The comment below was extracted from a module designed to test that a function could encode and decode base64. Other than the RFC number, someone reading this code has no need for the arcane information contained in the comment.
> 别在注释中添加有趣的历史性话题或者无关的细节描述。下列注释来自某个用来测试 base64 编解码函数的模块。除了 RFC 文档编号之外,注释中的其他细节信息对于读者完全没有必要。
```java
/*
RFC 2045 - Multipurpose Internet Mail Extensions (MIME)
Part One: Format of Internet Message Bodies
section 6.8. Base64 Content-Transfer-Encoding
The encoding process represents 24-bit groups of input bits as output
strings of 4 encoded characters. Proceeding from left to right, a
24-bit input group is formed by concatenating 3 8-bit input groups.
These 24 bits are then treated as 4 concatenated 6-bit groups, each
of which is translated into a single digit in the base64 alphabet.
When encoding a bit stream via the base64 encoding, the bit stream
must be presumed to be ordered with the most-significant-bit first.
That is, the first bit in the stream will be the high-order bit in
the first 8-bit byte, and the eighth bit will be the low-order bit in
the first 8-bit byte, and so on.
*/
```
### 4.4.16 Inobvious Connection 不明显的联系
The connection between a comment and the code it describes should be obvious. If you are going to the trouble to write a comment, then at least you’d like the reader to be able to look at the comment and the code and understand what the comment is talking about.
> 注释及其描述的代码之间的联系应该显而易见。如果你不嫌麻烦要写注释,至少让读者能看着注释和代码,并且理解注释所谈何物。
Consider, for example, this comment drawn from apache commons:
> 以来自 Apache 公共库的这段注释为例:
```java
/*
* start with an array that is big enough to hold all the pixels
* (plus filter bytes), and an extra 200 bytes for header info
*/
this.pngBytes = new byte[((this.width + 1) * this.height * 3) + 200];
```
What is a filter byte? Does it relate to the +1? Or to the \*3? Both? Is a pixel a byte? Why 200? The purpose of a comment is to explain code that does not explain itself. It is a pity when a comment needs its own explanation.
> 过滤器字节是什么?与那个+1 有关系吗?或与\*3 有关?还是与两者皆有关?为什么用 200?注释的作用是解释未能自行解释的代码。如果注释本身还需要解释,就太遗憾了。
### 4.4.17 Function Headers 函数头
Short functions don’t need much description. A well-chosen name for a small function that does one thing is usually better than a comment header.
> 短函数不需要太多描述。为只做一件事的短函数选个好名字,通常要比写函数头注释要好。
### 4.4.18 Javadocs in Nonpublic Code 非公共代码中的 Javadoc
As useful as javadocs are for public APIs, they are anathema to code that is not intended for public consumption. Generating javadoc pages for the classes and functions inside a system is not generally useful, and the extra formality of the javadoc comments amounts to little more than cruft and distraction.
> 虽然 Javadoc 对于公共 API 非常有用,但对于不打算作公共用途的代码就令人厌恶了。为系统中的类和函数生成 Javadoc 页并非总有用,而 Javadoc 注释额外的形式要求几乎等同于八股文章。
### 4.4.19 Example 范例
I wrote the module in Listing 4-7 for the first XP Immersion. It was intended to be an example of bad coding and commenting style. Kent Beck then refactored this code into a much more pleasant form in front of several dozen enthusiastic students. Later I adapted the example for my book Agile Software Development, Principles, Patterns, and Practices and the first of my Craftsman articles published in Software Development magazine.
> 我曾为首个 XP Immersion[4]课程编写了代码清单 4-7 列出的模块。这个模块几乎是糟糕的代码和坏注释风格的典范。后来 Kent Beck 当着几十位满腔热情的学生的面重构了这些代码,将其变得令人愉悦。后来,我在拙著 Agile Software Development,Principles,Patterns,and Practices(中译版《敏捷软件开发:原则、模式与实践》)和 Software Development(软件开发)杂志的“技艺”专栏的第一篇文章中引用了这个例子。
What I find fascinating about this module is that there was a time when many of us would have considered it “well documented.” Now we see it as a small mess. See how many different comment problems you can find.
> 这个模块最迷人的地方是,有那么一阵,我们中的许多人都认为它“文档做得很好”。如今,我们认为它是一小团乱麻。看看你能发现多少个不同的注释问题吧。
Listing 4-7 GeneratePrimes.java
> 代码清单 4-7 GeneratePrimes.java
```java
/**
* This class Generates prime numbers up to a user specified
* maximum. The algorithm used is the Sieve of Eratosthenes.
*
* Eratosthenes of Cyrene, b. c. 276 BC, Cyrene, Libya --
* d. c. 194, Alexandria. The first man to calculate the
* circumference of the Earth. Also known for working on
* calendars with leap years and ran the library at Alexandria.
*
* The algorithm is quite simple. Given an array of integers
* starting at 2. Cross out all multiples of 2. Find the next
* uncrossed integer, and cross out all of its multiples.
* Repeat untilyou have passed the square root of the maximum
* value.
*
* @author Alphonse
* @version 13 Feb 2002 atp
*/
import java.util.*;
public class GeneratePrimes
{
/**
* @param maxValue is the generation limit.
*/
public static int[] generatePrimes(int maxValue)
{
if (maxValue >= 2) // the only valid case
{
// declarations
int s = maxValue + 1; // size of array
boolean[] f = new boolean[s];
int i;
// initialize array to true.
for (i = 0; i < s; i++)
f[i] = true;
// get rid of known non-primes
f[0] = f[1] = false;
// sieve
int j;
for (i = 2; i < Math.sqrt(s) + 1; i++)
{
if (f[i]) // if i is uncrossed, cross its multiples.
{
for (j = 2 * i; j < s; j += i)
f[j] = false; // multiple is not prime
}
}
// how many primes are there?
int count = 0;
for (i = 0; i < s; i++)
{
if (f[i])
count++; // bump count.
}
int[] primes = new int[count];
// move the primes into the result
for (i = 0, j = 0; i < s; i++)
{
if (f[i]) // if prime
primes[j++] = i;
}
return primes; // return the primes
}
else // maxValue < 2
return new int[0]; // return null array if bad input.
}
}
```
In Listing 4-8 you can see a refactored version of the same module. Note that the use of comments is significantly restrained. There are just two comments in the whole module. Both comments are explanatory in nature.
> 在代码清单 4-8 中,你可以看到该模块重构后的版本。注意,注释的使用被明显地限制了。在整个模块中只有两个注释。每个注释都足具说明意义。
Listing 4-8 PrimeGenerator.java (refactored)
> 代码清单 4-8 PrimeGenerator.java(重构后)
```java
/**
* This class Generates prime numbers up to a user specified
* maximum. The algorithm used is the Sieve of Eratosthenes.
* Given an array of integers starting at 2:
* Find the first uncrossed integer, and cross out all its
* multiples. Repeat until there are no more multiples
* in the array.
*/
public class PrimeGenerator
{
private static boolean[] crossedOut;
private static int[] result;
public static int[] generatePrimes(int maxValue)
{
if (maxValue < 2)
return new int[0];
else
{
uncrossIntegersUpTo(maxValue);
crossOutMultiples();
putUncrossedIntegersIntoResult();
return result;
}
}
private static void uncrossIntegersUpTo(int maxValue)
{
crossedOut = new boolean[maxValue + 1];
for (int i = 2; i < crossedOut.length; i++)
crossedOut[i] = false;
}
private static void crossOutMultiples()
{
int limit = determineIterationLimit();
for (int i = 2; i <= limit; i++)
if (notCrossed(i))
crossOutMultiplesOf(i);
}
private static int determineIterationLimit()
{
// Every multiple in the array has a prime factor that
// is less than or equal to the root of the array size,
// so we don’t have to cross out multiples of numbers
// larger than that root.
double iterationLimit = Math.sqrt(crossedOut.length);
return (int) iterationLimit;
}
private static void crossOutMultiplesOf(int i)
{
for (int multiple = 2*i;
multiple < crossedOut.length;
multiple += i)
crossedOut[multiple] = true;
}
private static boolean notCrossed(int i)
{
return crossedOut[i] == false;
}
private static void putUncrossedIntegersIntoResult()
{
result = new int[numberOfUncrossedIntegers()];
for (int j = 0, i = 2; i < crossedOut.length; i++)
if (notCrossed(i))
result[j++] = i;
}
private static int numberOfUncrossedIntegers()
{
int count = 0;
for (int i = 2; i < crossedOut.length; i++)
if (notCrossed(i))
count++;
return count;
}
}
```
It is easy to argue that the first comment is redundant because it reads very much like the generatePrimes function itself. Still, I think the comment serves to ease the reader into the algorithm, so I’m inclined to leave it.
> 很容易说明,第一个注释完全是多余的,因为它读起来非常像是 generatePrimes 函数自身。不过,我认为这段注释还是省了读者去读具体算法的精力,所以我倾向于留下它。
The second argument is almost certainly necessary. It explains the rationale behind the use of the square root as the loop limit. I could find no simple variable name, nor any different coding structure that made this point clear. On the other hand, the use of the square root might be a conceit. Am I really saving that much time by limiting the iteration to the square root? Could the calculation of the square root take more time than I’m saving?
> 第二个注释显然很有必要。它解释了平方根作为循环限制的理由。我找不到能说明白这个问题的简单变量名或者其他编程结构。另外,对平方根的使用可能也有点武断。通过限制平方根循环,我是否真节省了许多时间?平方根计算所花的时间会不会比省下的时间还要多?
It’s worth thinking about. Using the square root as the iteration limit satisfies the old C and assembly language hacker in me, but I’m not convinced it’s worth the time and effort that everyone else will expend to understand it.
> 这些都值得考虑。使用平方根作为循环限制,满足了我这种旧式 C 语言和汇编语言黑客,不过我可不敢说抵得上其他人为理解它而花的时间和精力。
================================================
FILE: docs/ch5.md
================================================
# 第 5 章 Formatting 格式

When people look under the hood, we want them to be impressed with the neatness, consistency, and attention to detail that they perceive. We want them to be struck by the orderliness. We want their eyebrows to rise as they scroll through the modules. We want them to perceive that professionals have been at work. If instead they see a scrambled mass of code that looks like it was written by a bevy of drunken sailors, then they are likely to conclude that the same inattention to detail pervades every other aspect of the project.
> 当有人查看底层代码实现时,我们希望他们为其整洁、一致及所感知到的对细节的关注而震惊。我们希望他们高高扬起眉毛,一路看下去。我们希望他们感受到那些为之劳作的专业人士们。但若他们看到的只是一堆像是由酒醉的水手写出的鬼画符,那他们多半会得出结论,认为项目其他任何部分也同样对细节漠不关心。
You should take care that your code is nicely formatted. You should choose a set of simple rules that govern the format of your code, and then you should consistently apply those rules. If you are working on a team, then the team should agree to a single set of formatting rules and all members should comply. It helps to have an automated tool that can apply those formatting rules for you.
> 当有人查看底层代码实现时,我们希望他们为其整洁、一致及所感知到的对细节的关注而震惊。我们希望他们高高扬起眉毛,一路看下去。我们希望他们感受到那些为之劳作的专业人士们。但若他们看到的只是一堆像是由酒醉的水手写出的鬼画符,那他们多半会得出结论,认为项目其他任何部分也同样对细节漠不关心。
## 5.1 THE PURPOSE OF FORMATTING 格式的目的
First of all, let’s be clear. Code formatting is important. It is too important to ignore and it is too important to treat religiously. Code formatting is about communication, and communication is the professional developer’s first order of business.
> 先明确一下,代码格式很重要。代码格式不可忽略,必须严肃对待。代码格式关乎沟通,而沟通是专业开发者的头等大事。
Perhaps you thought that “getting it working” was the first order of business for a professional developer. I hope by now, however, that this book has disabused you of that idea. The functionality that you create today has a good chance of changing in the next release, but the readability of your code will have a profound effect on all the changes that will ever be made. The coding style and readability set precedents that continue to affect maintainability and extensibility long after the original code has been changed beyond recognition. Your style and discipline survives, even though your code does not.
> 或许你认为“让代码能工作”才是专业开发者的头等大事。然而,我希望本书能让你抛掉那种想法。你今天编写的功能,极有可能在下一版本中被修改,但代码的可读性却会对以后可能发生的修改行为产生深远影响。原始代码修改之后很久,其代码风格和可读性仍会影响到可维护性和扩展性。即便代码已不复存在,你的风格和律条仍存活下来。
So what are the formatting issues that help us to communicate best?
> 那么,哪些代码格式相关方面能帮我们最好地沟通呢?
## 5.2 VERTICAL FORMATTING 垂直格式
Let’s start with vertical size. How big should a source file be? In Java, file size is closely related to class size. We’ll talk about class size when we talk about classes. For the moment let’s just consider file size.
> 从垂直尺寸开始吧。源代码文件该有多大?在 Java 中,文件尺寸与类尺寸极其相关。讨论类时再说类的尺寸。现在先考虑文件尺寸。
How big are most Java source files? It turns out that there is a huge range of sizes and some remarkable differences in style. Figure 5-1 shows some of those differences.
> 多数 Java 源代码文件有多大?事实说明,尺寸各各不同,长度殊异,如图 5-1 所示。
Seven different projects are depicted. Junit, FitNesse, testNG, Time and Money, JDepend, Ant, and Tomcat. The lines through the boxes show the minimum and maximum file lengths in each project. The box shows approximately one-third (one standard deviation1) of the files. The middle of the box is the mean. So the average file size in the FitNesse project is about 65 lines, and about one-third of the files are between 40 and 100+ lines. The largest file in FitNesse is about 400 lines and the smallest is 6 lines. Note that this is a log scale, so the small difference in vertical position implies a very large difference in absolute size.
> 图 5-1 中涉及 7 个不同项目:Junit、FitNesse、testNG、Time and Money、JDepend、Ant 和 Tomcat。贯穿方块的直线两端显示这些项目中最小和最大的文件长度。方块表示在平均值以上或以下的大约三分之一文件(一个标准偏差[1])的长度。方块中间位置就是平均数。所以 FitNesse 项目的文件平均尺寸是 65 行,而上面三分之一在 40 ~ 100 行及 100 行以上之间。FitNesse 中最大的文件大约 400 行,最小是 6 行。这是个对数标尺,所以较小的垂直位置差异意味着文件绝对尺寸的较大差异。
1. The box shows sigma/2 above and below the mean. Yes, I know that the file length distribution is not normal, and so the standard deviation is not mathematically precise. But we’re not trying for precision here. We’re just trying to get a feel.
Figure 5-1 File length distributions LOG scale (box height = sigma)

Junit, FitNesse, and Time and Money are composed of relatively small files. None are over 500 lines and most of those files are less than 200 lines. Tomcat and Ant, on the other hand, have some files that are several thousand lines long and close to half are over 200 lines.
> Junit、FitNesse 和 Time and Money 由相对较小的文件组成。没有一个超过 500 行,多数都小于 200 行。Tomcat 和 Ant 则有些文件达到数千行,将近一半文件长于 200 行。
What does that mean to us? It appears to be possible to build significant systems (FitNesse is close to 50,000 lines) out of files that are typically 200 lines long, with an upper limit of 500. Although this should not be a hard and fast rule, it should be considered very desirable. Small files are usually easier to understand than large files are.
> 对我们来说,这意味着什么?意味着有可能用大多数为 200 行、最长 500 行的单个文件构造出色的系统(FitNesse 总长约 50000 行)。尽管这并非不可违背的原则,也应该乐于接受。短文件通常比长文件易于理解。
### 5.2.1 The Newspaper Metaphor 向报纸学习
Think of a well-written newspaper article. You read it vertically. At the top you expect a headline that will tell you what the story is about and allows you to decide whether it is something you want to read. The first paragraph gives you a synopsis of the whole story, hiding all the details while giving you the broad-brush concepts. As you continue downward, the details increase until you have all the dates, names, quotes, claims, and other minutia.
> 想想看写得很好的报纸文章。你从上到下阅读。在顶部,你期望有个头条,告诉你故事主题,好让你决定是否要读下去。第一段是整个故事的大纲,给出粗线条概述,但隐藏了故事细节。接着读下去,细节渐次增加,直至你了解所有的日期、名字、引语、说法及其他细节。
We would like a source file to be like a newspaper article. The name should be simple but explanatory. The name, by itself, should be sufficient to tell us whether we are in the right module or not. The topmost parts of the source file should provide the high-level concepts and algorithms. Detail should increase as we move downward, until at the end we find the lowest level functions and details in the source file.
> 源文件也要像报纸文章那样。名称应当简单且一目了然。名称本身应该足够告诉我们是否在正确的模块中。源文件最顶部应该给出高层次概念和算法。细节应该往下渐次展开,直至找到源文件中最底层的函数和细节。
A newspaper is composed of many articles; most are very small. Some are a bit larger. Very few contain as much text as a page can hold. This makes the newspaper usable. If the newspaper were just one long story containing a disorganized agglomeration of facts, dates, and names, then we simply would not read it.
> 报纸由许多篇文章组成;多数短小精悍。有些稍微长点儿。很少有占满一整页的。这样做,报纸才可用。假若一份报纸只登载一篇长故事,其中充斥毫无组织的事实、日期、名字等,没人会去读它。
### 5.2.2 Vertical Openness Between Concepts 概念间垂直方向上的区隔
Nearly all code is read left to right and top to bottom. Each line represents an expression or a clause, and each group of lines represents a complete thought. Those thoughts should be separated from each other with blank lines.
> 几乎所有的代码都是从上往下读,从左往右读。每行展现一个表达式或一个子句,每组代码行展示一条完整的思路。这些思路用空白行区隔开来。
Consider, for example, Listing 5-1. There are blank lines that separate the package declaration, the import(s), and each of the functions. This extremely simple rule has a profound effect on the visual layout of the code. Each blank line is a visual cue that identifies a new and separate concept. As you scan down the listing, your eye is drawn to the first line that follows a blank line.
> 以代码清单 5-1 为例。在封包声明、导入声明和每个函数之间,都有空白行隔开。这条极其简单的规则极大地影响到代码的视觉外观。每个空白行都是一条线索,标识出新的独立概念。往下读代码时,你的目光总会停留于空白行之后那一行。
Listing 5-1 BoldWidget.java
> 代码清单 5-1 BoldWidget.java
```java
package fitnesse.wikitext.widgets;
import java.util.regex.*;
public class BoldWidget extends ParentWidget {
public static final String REGEXP = “’’’.+?’’’”;
private static final Pattern pattern = Pattern.compile(“’’’(.+?)’’’”,
Pattern.MULTILINE + Pattern.DOTALL
);
public BoldWidget(ParentWidget parent, String text) throws Exception {
super(parent);
Matcher match = pattern.matcher(text);
match.find();
addChildWidgets(match.group(1));
}
public String render() throws Exception {
StringBuffer html = new StringBuffer(“”);
html.append(childHtml()).append(“”);
return html.toString();
}
}
```
Taking those blank lines out, as in Listing 5-2, has a remarkably obscuring effect on the readability of the code.
> 如代码清单 5-2 所示,抽掉这些空白行,代码可读性减弱了不少。
Listing 5-2 BoldWidget.java
> 代码清单 5-2 BoldWidget.java
```java
package fitnesse.wikitext.widgets;
import java.util.regex.*;
public class BoldWidget extends ParentWidget {
public static final String REGEXP = “’’’.+?’’’”;
private static final Pattern pattern = Pattern.compile(“’’’(.+?)’’’”,
Pattern.MULTILINE + Pattern.DOTALL);
public BoldWidget(ParentWidget parent, String text) throws Exception {
super(parent);
Matcher match = pattern.matcher(text);
match.find();
addChildWidgets(match.group(1));}
public String render() throws Exception {
StringBuffer html = new StringBuffer(“”);
html.append(childHtml()).append(“”);
return html.toString();
}
}
```
This effect is even more pronounced when you unfocus your eyes. In the first example the different groupings of lines pop out at you, whereas the second example looks like a muddle. The difference between these two listings is a bit of vertical openness.
> 在你不特意注视时,后果就更严重了。在第一个例子中,代码组会跳到你眼中,而第二个例子就像一堆乱麻。两段代码的区别,展示了垂直方向上区隔的作用。
### 5.2.3 Vertical Density 垂直方向上的靠近
If openness separates concepts, then vertical density implies close association. So lines of code that are tightly related should appear vertically dense. Notice how the useless comments in Listing 5-3 break the close association of the two instance variables.
> 如果说空白行隔开了概念,靠近的代码行则暗示了它们之间的紧密关系。所以,紧密相关的代码应该互相靠近。注意代码清单 5-3 中的注释是如何割断两个实体变量间的联系的。
Listing 5-3
> 代码清单 5-3
```java
public class ReporterConfig {
/**
* The class name of the reporter listener
*/
private String m_className;
/**
* The properties of the reporter listener
*/
private List m_properties = new ArrayList();
public void addProperty(Property property) {
m_properties.add(property);
}
```
Listing 5-4 is much easier to read. It fits in an “eye-full,” or at least it does for me. I can look at it and see that this is a class with two variables and a method, without having to move my head or eyes much. The previous listing forces me to use much more eye and head motion to achieve the same level of comprehension.
> 代码清单 5-4 更易于阅读。它刚好“一览无遗”,至少对我来说是这样。我一眼就能看到,这是个有两个变量和一个方法的类。看上面的代码时,我不得不更多地移动头部和眼球,才能获得相同的理解度。
Listing 5-4
> 代码清单 5-4
```java
public class ReporterConfig {
private String m_className;
private List m_properties = new ArrayList();
public void addProperty(Property property) {
m_properties.add(property);
}
```
### 5.2.4 Vertical Distance 垂直距离
Have you ever chased your tail through a class, hopping from one function to the next, scrolling up and down the source file, trying to divine how the functions relate and operate, only to get lost in a rat’s nest of confusion? Have you ever hunted up the chain of inheritance for the definition of a variable or function? This is frustrating because you are trying to understand what the system does, but you are spending your time and mental energy on trying to locate and remember where the pieces are.
> 你是否曾经在某个类中摸索,从一个函数跳到另一个函数,上下求索,想要弄清楚这些函数如何操作、如何互相相关,最后却被搞糊涂了?你是否曾经苦苦追索某个变量或函数的继承链条?这让人沮丧,因为你想要理解系统做什么,但却花时间和精力在找到和记住那些代码碎片在哪里。
Concepts that are closely related should be kept vertically close to each other [G10]. Clearly this rule doesn’t work for concepts that belong in separate files. But then closely related concepts should not be separated into different files unless you have a very good reason. Indeed, this is one of the reasons that protected variables should be avoided.
> 关系密切的概念应该互相靠近[G10]。显然,这条规则并不适用于分布在不同文件中的概念。除非有很好的理由,否则就不要把关系密切的概念放到不同的文件中。实际上,这也是避免使用 protected 变量的理由之一。
For those concepts that are so closely related that they belong in the same source file, their vertical separation should be a measure of how important each is to the understandability of the other. We want to avoid forcing our readers to hop around through our source files and classes.
> 对于那些关系密切、放置于同一源文件中的概念,它们之间的区隔应该成为对相互的易懂度有多重要的衡量标准。应避免迫使读者在源文件和类中跳来跳去。
Variable Declarations. Variables should be declared as close to their usage as possible. Because our functions are very short, local variables should appear a the top of each function, as in this longish function from Junit4.3.1.
> 变量声明。变量声明应尽可能靠近其使用位置。因为函数很短,本地变量应该在函数的顶部出现,就像 Junit4.3.1 中这个稍长的函数中那样。
```java
private static void readPreferences() {
InputStream is= null;
try {
is= new FileInputStream(getPreferencesFile());
setPreferences(new Properties(getPreferences()));
getPreferences().load(is);
} catch (IOException e) {
try {
if (is != null)
is.close();
} catch (IOException e1) {
}
}
}
```
Control variables for loops should usually be declared within the loop statement, as in this cute little function from the same source.
> 对于那些关系密切、放置于同一源文件中的概念,它们之间的区隔应该成为对相互的易懂度有多重要的衡量标准。应避免迫使读者在源文件和类中跳来跳去。
```java
public int countTestCases() {
int count= 0;
for (Test each : tests)
count += each.countTestCases();
return count;
}
```
In rare cases a variable might be declared at the top of a block or just before a loop in a long-ish function. You can see such a variable in this snippet from the midst of a very long function in TestNG.
> 偶尔,在较长的函数中,变量也可能在某个代码块顶部,或在循环之前声明。你可以在以下摘自 TestNG 中一个长函数的代码片段中找到类似的变量。
```java
…
for (XmlTest test : m_suite.getTests()) {
TestRunner tr = m_runnerFactory.newTestRunner(this, test);
tr.addListener(m_textReporter);
m_testRunners.add(tr);
invoker = tr.getInvoker();
for (ITestNGMethod m : tr.getBeforeSuiteMethods()) {
beforeSuiteMethods.put(m.getMethod(), m);
}
for (ITestNGMethod m : tr.getAfterSuiteMethods()) {
afterSuiteMethods.put(m.getMethod(), m);
}
}
…
```
Instance variables, on the other hand, should be declared at the top of the class. This should not increase the vertical distance of these variables, because in a well-designed class, they are used by many, if not all, of the methods of the class.
> 实体变量应该在类的顶部声明。这应该不会增加变量的垂直距离,因为在设计良好的类中,它们如果不是被该类的所有方法也是被大多数方法所用。
There have been many debates over where instance variables should go. In C++ we commonly practiced the so-called scissors rule, which put all the instance variables at the bottom. The common convention in Java, however, is to put them all at the top of the class. I see no reason to follow any other convention. The important thing is for the instance variables to be declared in one well-known place. Everybody should know where to go to see the declarations.
> 关于实体变量应该放在哪里,争论不断。在 C++中,通常会采用所谓“剪刀原则”(scissors rule),所有实体变量都放在底部。而在 Java 中,惯例是放在类的顶部。没理由去遵循其他惯例。重点是在谁都知道的地方声明实体变量。大家都应该知道在哪儿能看到这些声明。
Consider, for example, the strange case of the TestSuite class in JUnit 4.3.1. I have greatly attenuated this class to make the point. If you look about halfway down the listing, you will see two instance variables declared there. It would be hard to hide them in a better place. Someone reading this code would have to stumble across the declarations by accident (as I did).
> 例如 JUnit 4.3.1 中的这个奇怪情形。我极力删减了这个类,好说明问题。如果你看到代码清单大致一半的位置,会看到在那里声明了两个实体变量。如果放在更好的位置,它们就会更明显。而现在,读代码者只能在无意中看到这些声明(就像我一样)。
```java
public class TestSuite implements Test {
static public Test createTest(Class extends TestCase> theClass,
String name) {
…
}
public static Constructor extends TestCase>
getTestConstructor(Class extends TestCase> theClass)
throws NoSuchMethodException {
…
}
public static Test warning(final String message) {
…
}
private static String exceptionToString(Throwable t) {
…
}
private String fName;
private Vector fTests= new Vector(10);
public TestSuite() {
}
public TestSuite(final Class extends TestCase> theClass) {
…
}
public TestSuite(Class extends TestCase> theClass, String name) {
…
}
… … … … …
}
```
Dependent Functions. If one function calls another, they should be vertically close, and the caller should be above the callee, if at all possible. This gives the program a natural flow. If the convention is followed reliably, readers will be able to trust that function definitions will follow shortly after their use. Consider, for example, the snippet from FitNesse in Listing 5-5. Notice how the topmost function calls those below it and how they in turn call those below them. This makes it easy to find the called functions and greatly enhances the readability of the whole module.
> 相关函数。若某个函数调用了另外一个,就应该把它们放到一起,而且调用者应该尽可能放在被调用者上面。这样,程序就有个自然的顺序。若坚定地遵循这条约定,读者将能够确信函数声明总会在其调用后很快出现。以源自 FitNesse 的代码清单 5-5 为例。注意顶部的函数是如何调用其下的函数,而这些被调用的函数又是如何调用更下面的函数的。这样就能轻易找到被调用的函数,极大地增强了整个模块的可读性。
Listing 5-5 WikiPageResponder.java
> 代码清单 5-5 WikiPageResponder.java
```java
public class WikiPageResponder implements SecureResponder {
protected WikiPage page;
protected PageData pageData;
protected String pageTitle;
protected Request request;
protected PageCrawler crawler;
public Response makeResponse(FitNesseContext context, Request request)
throws Exception {
String pageName = getPageNameOrDefault(request, “FrontPage”);
loadPage(pageName, context);
if (page == null)
return notFoundResponse(context, request);
else
return makePageResponse(context);
}
private String getPageNameOrDefault(Request request, String defaultPageName)
{
String pageName = request.getResource();
if (StringUtil.isBlank(pageName))
pageName = defaultPageName;
return pageName;
}
protected void loadPage(String resource, FitNesseContext context)
throws Exception {
WikiPagePath path = PathParser.parse(resource);
crawler = context.root.getPageCrawler();
crawler.setDeadEndStrategy(new VirtualEnabledPageCrawler());
page = crawler.getPage(context.root, path);
if (page != null)
pageData = page.getData();
}
private Response notFoundResponse(FitNesseContext context, Request request)
throws Exception {
return new NotFoundResponder().makeResponse(context, request);
}
private SimpleResponse makePageResponse(FitNesseContext context)
throws Exception {
pageTitle = PathParser.render(crawler.getFullPath(page));
String html = makeHtml(context);
SimpleResponse response = new SimpleResponse();
response.setMaxAge(0);
response.setContent(html);
return response;
}
…
```
As an aside, this snippet provides a nice example of keeping constants at the appropriate level [G35]. The “FrontPage” constant could have been buried in the getPageNameOrDefault function, but that would have hidden a well-known and expected constant in an inappropriately low-level function. It was better to pass that constant down from the place where it makes sense to know it to the place that actually uses it.
> 说句题外话,以上代码片段也是把常量保持在恰当级别的好例子[G35]。FrontPage 常量可以埋在 getPageNameOrDefault 函数中,但那样就会把一个众人皆知的常量埋藏到位于不太合适的底层函数中。更好的做法是把它放在易于找到的位置,然后再传递到真实使用的位置。
Conceptual Affinity. Certain bits of code want to be near other bits. They have a certain conceptual affinity. The stronger that affinity, the less vertical distance there should be between them.
> 概念相关。概念相关的代码应该放到一起。相关性越强,彼此之间的距离就该越短。
As we have seen, this affinity might be based on a direct dependence, such as one function calling another, or a function using a variable. But there are other possible causes of affinity. Affinity might be caused because a group of functions perform a similar operation. Consider this snippet of code from Junit 4.3.1:
> 如上所述,相关性应建立在直接依赖的基础上,如函数间调用,或函数使用某个变量。但也有其他相关性的可能。相关性可能来自于执行相似操作的一组函数。请看以下来自 Junit 4.3.1 的代码片段:

```java
public class Assert {
static public void assertTrue(String message, boolean condition) {
if (!condition)
fail(message);
}
static public void assertTrue(boolean condition) {
assertTrue(null, condition);
}
static public void assertFalse(String message, boolean condition) {
assertTrue(message, !condition);
}
static public void assertFalse(boolean condition) {
assertFalse(null, condition);
}
…
```
These functions have a strong conceptual affinity because they share a common naming scheme and perform variations of the same basic task. The fact that they call each other is secondary. Even if they didn’t, they would still want to be close together.
这些函数有着极强的概念相关性,因为他们拥有共同的命名模式,执行同一基础任务的不同变种。互相调用是第二位的。即便没有互相调用,也应该放在一起。
### 5.2.5 Vertical Ordering 垂直顺序
In general we want function call dependencies to point in the downward direction. That is, a function that is called should be below a function that does the calling.2 This creates a nice flow down the source code module from high level to low level.
> 一般而言,我们想自上向下展示函数调用依赖顺序。也就是说,被调用的函数应该放在执行调用的函数下面[2]。这样就建立了一种自顶向下贯穿源代码模块的良好信息流。
2. This is the exact opposite of languages like Pascal, C, and C++ that enforce functions to be defined, or at least declared, before they are used.
As in newspaper articles, we expect the most important concepts to come first, and we expect them to be expressed with the least amount of polluting detail. We expect the low-level details to come last. This allows us to skim source files, getting the gist from the first few functions, without having to immerse ourselves in the details. Listing 5-5 is organized this way. Perhaps even better examples are Listing 15-5 on page 263, and Listing 3-7 on page 50.
> 像报纸文章一般,我们指望最重要的概念先出来,指望以包括最少细节的方式表述它们。我们指望底层细节最后出来。这样,我们就能扫过源代码文件,自最前面的几个函数获知要旨,而不至于沉溺到细节中。代码清单 5-5 就是如此组织的。或许,更好的例子是代码清单 15-5,及代码清单 3-7。
## 5.3 HORIZONTAL FORMATTING 横向格式
How wide should a line be? To answer that, let’s look at how wide lines are in typical programs. Again, we examine the seven different projects. Figure 5-2 shows the distribution of line lengths of all seven projects. The regularity is impressive, especially right around 45 characters. Indeed, every size from 20 to 60 represents about 1 percent of the total number of lines. That’s 40 percent! Perhaps another 30 percent are less than 10 characters wide. Remember this is a log scale, so the linear appearance of the drop-off above 80 characters is really very significant. Programmers clearly prefer short lines.
> 一行代码应该有多宽?要回答这个问题,来看看典型的程序中代码行的宽度。我们再一次检验 7 个不同项目。图 5-2 展示了这 7 个项目的代码行宽度分布情况。其中展现的规律性令人印象深刻,45 个字符左右的宽度分布尤为如此。其实,20 ~ 60 的每个尺寸,都代表全部代码行数的 1%。也就是总共 40%!或许其余 30%的代码行短于 10 个字符。记住,这是个对数标尺,所以图中长于 80 个字符部分的线性下降在实际情况中会极其可观。程序员们显然更喜爱短代码行。
Figure 5-2 Java line width distribution

This suggests that we should strive to keep our lines short. The old Hollerith limit of 80 is a bit arbitrary, and I’m not opposed to lines edging out to 100 or even 120. But beyond that is probably just careless.
> 这说明,应该尽力保持代码行短小。死守 80 个字符的上限有点僵化,而且我也并不反对代码行长度达到 100 个字符或 120 个字符。再多的话,大抵就是肆意妄为了。
I used to follow the rule that you should never have to scroll to the right. But monitors are too wide for that nowadays, and younger programmers can shrink the font so small that they can get 200 characters across the screen. Don’t do that. I personally set my limit at 120.
> 我一向遵循无需拖动滚动条到右边的原则。但近年来显示器越来越宽,而年轻程序员又能将显示字符缩小到如此程度,屏幕上甚至能容纳 200 个字符的宽度。别那么做。我个人的上限是 120 个字符。
### 5.3.1 Horizontal Openness and Density 水平方向上的区隔与靠近
We use horizontal white space to associate things that are strongly related and disassociate things that are more weakly related. Consider the following function:
> 我们使用空格字符将彼此紧密相关的事物连接到一起,也用空格字符把相关性较弱的事物分隔开。请看以下函数:
```java
private void measureLine(String line) {
lineCount++;
int lineSize = line.length();
totalChars += lineSize;
lineWidthHistogram.addLine(lineSize, lineCount);
recordWidestLine(lineSize);
}
```
I surrounded the assignment operators with white space to accentuate them. Assignment statements have two distinct and major elements: the left side and the right side. The spaces make that separation obvious.
> 我在赋值操作符周围加上空格字符,以此达到强调目的。赋值语句有两个确定而重要的要素:左边和右边。空格字符加强了分隔效果。
On the other hand, I didn’t put spaces between the function names and the opening parenthesis. This is because the function and its arguments are closely related. Separating them makes them appear disjoined instead of conjoined. I separate arguments within the function call parenthesis to accentuate the comma and show that the arguments are separate.
> 另一方面,我不在函数名和左圆括号之间加空格。这是因为函数与其参数密切相关,如果隔开,就会显得互无关系。我把函数调用括号中的参数一一隔开,强调逗号,表示参数是互相分离的。
Another use for white space is to accentuate the precedence of operators.
> 空格字符的另一种用法是强调其前面的运算符。
```java
public class Quadratic {
public static double root1(double a, double b, double c) {
double determinant = determinant(a, b, c);
return (-b + Math.sqrt(determinant)) / (2*a);
}
public static double root2(int a, int b, int c) {
double determinant = determinant(a, b, c);
return (-b - Math.sqrt(determinant)) / (2*a);
}
private static double determinant(double a, double b, double c) {
return b*b - 4*a*c;
}
}
```
Notice how nicely the equations read. The factors have no white space between them because they are high precedence. The terms are separated by white space because addition and subtraction are lower precedence.
> 看看这些等式读起来多舒服。乘法因子之间没加空格,因为它们具有较高优先级。加减法运算项之间用空格隔开,因为加法和减法优先级较低。
Unfortunately, most tools for reformatting code are blind to the precedence of operators and impose the same spacing throughout. So subtle spacings like those shown above tend to get lost after you reformat the code.
> 不幸的是,多数代码格式化工具都会漠视运算符优先级,从头到尾采用同样的空格方式。在重新格式化代码后,以上这些微妙的空格用法就消失殆尽了。
### 5.3.2 Horizontal Alignment 水平对齐
When I was an assembly language programmer,3 I used horizontal alignment to accentuate certain structures. When I started coding in C, C++, and eventually Java, I continued to try to line up all the variable names in a set of declarations, or all the rvalues in a set of assignment statements. My code might have looked like this:
> 当我还是个汇编语言程序员时[3],使用水平对齐来强调某些程序结构。开始用 C、C++编码,最终转向 Java 后,我继续尽力对齐一组声明中的变量名,或一组赋值语句中的右值。我的代码看起来大概是这样:
3. Who am I kidding? I still am an assembly language programmer. You can take the boy away from the metal, but you can’t take the metal out of the boy!
```java
public class FitNesseExpediter implements ResponseSender
{
private Socket socket;
private InputStream input;
private OutputStream output;
private Request request;
private Response response;
private FitNesseContext context;
protected long requestParsingTimeLimit;
private long requestProgress;
private long requestParsingDeadline;
private boolean hasError;
public FitNesseExpediter(Socket s,
FitNesseContext context) throws Exception
{
this.context = context;
socket = s;
input = s.getInputStream();
output = s.getOutputStream();
requestParsingTimeLimit = 10000;
}
```
I have found, however, that this kind of alignment is not useful. The alignment seems to emphasize the wrong things and leads my eye away from the true intent. For example, in the list of declarations above you are tempted to read down the list of variable names without looking at their types. Likewise, in the list of assignment statements you are tempted to look down the list of rvalues without ever seeing the assignment operator. To make matters worse, automatic reformatting tools usually eliminate this kind of alignment.
> 我发现这种对齐方式没什么用。对齐,像是在强调不重要的东西,把我的目光从真正的意义上拉开。例如,在上面的声明列表中,你会从上到下阅读变量名,而忽视了它们的类型。同样,在赋值语句代码清单中,你也会从上到下阅读右值,而对赋值运算符视而不见。更麻烦的是,代码自动格式化工具通常会把这类对齐消除掉。
So, in the end, I don’t do this kind of thing anymore. Nowadays I prefer unaligned declarations and assignments, as shown below, because they point out an important deficiency. If I have long lists that need to be aligned, the problem is the length of the lists, not the lack of alignment. The length of the list of declarations in FitNesseExpediter below suggests that this class should be split up.
> 所以,我最终放弃了这种做法。如今,我更喜欢用不对齐的声明和赋值,如下所示,因为它们指出了重点。如果有较长的列表需要做对齐处理,那问题就是在列表的长度上而不是对齐上。下例 FitNesseExpediter 类中声明列表的长度说明该类应该被拆分了。
```java
public class FitNesseExpediter implements ResponseSender
{
private Socket socket;
private InputStream input;
private OutputStream output;
private Request request;
private Response response;
private FitNesseContext context;
protected long requestParsingTimeLimit;
private long requestProgress;
private long requestParsingDeadline;
private boolean hasError;
public FitNesseExpediter(Socket s, FitNesseContext context) throws Exception
{
this.context = context;
socket = s;
input = s.getInputStream();
output = s.getOutputStream();
requestParsingTimeLimit = 10000;
}
```
### 5.3.3 Indentation 缩进
A source file is a hierarchy rather like an outline. There is information that pertains to the file as a whole, to the individual classes within the file, to the methods within the classes, to the blocks within the methods, and recursively to the blocks within the blocks. Each level of this hierarchy is a scope into which names can be declared and in which declarations and executable statements are interpreted.
> 源文件是一种继承结构,而不是一种大纲结构。其中的信息涉及整个文件、文件中每个类、类中的方法、方法中的代码块,也涉及代码块中的代码块。这种继承结构中的每一层级都圈出一个范围,名称可以在其中声明,而声明和执行语句也可以在其中解释。
To make this hierarchy of scopes visible, we indent the lines of source code in proportion to their position in the hiearchy. Statements at the level of the file, such as most class declarations, are not indented at all. Methods within a class are indented one level to the right of the class. Implementations of those methods are implemented one level to the right of the method declaration. Block implementations are implemented one level to the right of their containing block, and so on.
> 要让这种范围式继承结构可见,我们依源代码行在继承结构中的位置对源代码行做缩进处理。在文件顶层的语句,例如大多数的类声明,根本不缩进。类中的方法相对该类缩进一个层级。方法的实现相对方法声明缩进一个层级。代码块的实现相对于其容器代码块缩进一个层级,以此类推。
Programmers rely heavily on this indentation scheme. They visually line up lines on the left to see what scope they appear in. This allows them to quickly hop over scopes, such as implementations of if or while statements, that are not relevant to their current situation. They scan the left for new method declarations, new variables, and even new classes. Without indentation, programs would be virtually unreadable by humans.
> 程序员相当依赖这种缩进模式。他们从代码行左边查看自己在什么范围中工作。这让他们能快速跳过与当前关注的情形无关的范围,例如 if 或 while 语句的实现之类。他们的眼光扫过左边,查找新的方法声明、新变量,甚至新类。没有缩进的话,程序就会变得无法阅读。
Consider the following programs that are syntactically and semantically identical:
> 试看以下在语法和语义上等价的两个程序:
```java
public class FitNesseServer implements SocketServer { private FitNesseContext
context; public FitNesseServer(FitNesseContext context) { this.context =
context; } public void serve(Socket s) { serve(s, 10000); } public void
serve(Socket s, long requestTimeout) { try { FitNesseExpediter sender = new
FitNesseExpediter(s, context);
sender.setRequestParsingTimeLimit(requestTimeout); sender.start(); }
catch(Exception e) { e.printStackTrace(); } } }
-----
public class FitNesseServer implements SocketServer {
private FitNesseContext context;
public FitNesseServer(FitNesseContext context) {
this.context = context;
}
public void serve(Socket s) {
serve(s, 10000);
}
public void serve(Socket s, long requestTimeout) {
try {
FitNesseExpediter sender = new FitNesseExpediter(s, context);
sender.setRequestParsingTimeLimit(requestTimeout);
sender.start();
}
catch (Exception e) {
e.printStackTrace();
}
}
}
```
Your eye can rapidly discern the structure of the indented file. You can almost instantly spot the variables, constructors, accessors, and methods. It takes just a few seconds to realize that this is some kind of simple front end to a socket, with a time-out. The unindented version, however, is virtually impenetrable without intense study.
> 你能很快地洞悉有缩进的那个文件的结构。你几乎能立即就辨别出那些变量、构造器、存取器和方法。只需要几秒钟就能了解这是一个套接字的简单前端,其中包括了超时设定。而未缩进的版本则不经过一番折腾就无法明白。
Breaking Indentation. It is sometimes tempting to break the indentation rule for short if statements, short while loops, or short functions. Whenever I have succumbed to this temptation, I have almost always gone back and put the indentation back in. So I avoid collapsing scopes down to one line like this:
> 违反缩进规则。有时,会忍不住想要在短小的 if 语句、while 循环或小函数中违反缩进规则。一旦这么做了,我多数时候还是会回头加上缩进。这样就避免了出现以下这种范围层级坍塌到一行的情况:
```java
public class CommentWidget extends TextWidget
{
public static final String REGEXP = “^#[^\r\n]*(?:(?:\r\n)|\n|\r)?”;
public CommentWidget(ParentWidget parent, String text){super(parent, text);}
public String render() throws Exception {return “”; }
}
```
I prefer to expand and indent the scopes instead, like this:
> 我更喜欢扩展和缩进范围,就像这样:
```java
public class CommentWidget extends TextWidget {
public static final String REGEXP = “^#[^\r\n]*(?:(?:\r\n)|\n|\r)?”
public CommentWidget(ParentWidget parent, String text) {
super(parent, text);
}
public String render() throws Exception {
return “”;
}
}
```
### 5.3.4 Dummy Scopes 空范围
Sometimes the body of a while or for statement is a dummy, as shown below. I don’t like these kinds of structures and try to avoid them. When I can’t avoid them, I make sure that the dummy body is properly indented and surrounded by braces. I can’t tell you how many times I’ve been fooled by a semicolon silently sitting at the end of a while loop on the same line. Unless you make that semicolon visible by indenting it on it’s own line, it’s just too hard to see.
> 有时,while 或 for 语句的语句体为空,如下所示。我不喜欢这种结构,尽量不使用。如果无法避免,就确保空范围体的缩进,用括号包围起来。我无法告诉你,我曾经多少次被静静安坐在与 while 循环语句同一行末尾的分号所欺骗。除非你把那个分号放到另一行再加以缩进,否则就很难看到它。
```java
while (dis.read(buf, 0, readBufferSize) != -1) ;
```
## 5.4 TEAM RULES 团队规则
The title of this section is a play on words. Every programmer has his own favorite formatting rules, but if he works in a team, then the team rules.
> 每个程序员都有自己喜欢的格式规则,但如果在一个团队中工作,就是团队说了算[4]。
A team of developers should agree upon a single formatting style, and then every member of that team should use that style. We want the software to have a consistent style. We don’t want it to appear to have been written by a bunch of disagreeing individuals.
> 一组开发者应当认同一种格式风格,每个成员都应该采用那种风格。我们想要让软件拥有一以贯之的风格。我们不想让它显得是由一大票意见相左的个人所写成。

When I started the FitNesse project back in 2002, I sat down with the team to work out a coding style. This took about 10 minutes. We decided where we’d put our braces, what our indent size would be, how we would name classes, variables, and methods, and so forth. Then we encoded those rules into the code formatter of our IDE and have stuck with them ever since. These were not the rules that I prefer; they were rules decided by the team. As a member of that team I followed them when writing code in the FitNesse project.
> 002 年启动 FitNesse 项目时,我和开发团队一起制订了一套编码风格。这只花了我们 10 分钟时间。我们决定了在什么地方放置括号,缩进几个字符,如何命名类、变量和方法,如此等等。然后,我们把这些规则编写进 IDE 的代码格式功能,接着就一直沿用。这些规则并非全是我喜爱的;但它们是团队决定了的规则。作为团队一员,在为 FitNesse 项目编写代码时,我遵循这些规则。
Remember, a good software system is composed of a set of documents that read nicely. They need to have a consistent and smooth style. The reader needs to be able to trust that the formatting gestures he or she has seen in one source file will mean the same thing in others. The last thing we want to do is add more complexity to the source code by writing it in a jumble of different individual styles.
> 记住,好的软件系统是由一系列读起来不错的代码文件组成的。它们需要拥有一致和顺畅的风格。读者要能确信,他们在一个源文件中看到的格式风格在其他文件中也是同样的用法。绝对不要用各种不同的风格来编写源代码,这样会增加其复杂度。
## 5.5 UNCLE BOB’S FORMATTING RULES 鲍勃大叔的格式规则
The rules I use personally are very simple and are illustrated by the code in Listing 5-6. Consider this an example of how code makes the best coding standard document.
> 我个人使用的规则相当简单,如代码清单 5-6 所示。可以把这段代码看作是展示如何把代码写成最好的编码标准文档的范例。
Listing 5-6 CodeAnalyzer.java
> 代码清单 5-6 CodeAnalyzer.java
```java
public int getWidestLineNumber() {
return widestLineNumber;
}
public LineWidthHistogram getLineWidthHistogram() {
return lineWidthHistogram;
}
public double getMeanLineWidth() {
return (double)totalChars/lineCount;
}
public int getMedianLineWidth() {
Integer[] sortedWidths = getSortedWidths();
int cumulativeLineCount = 0;
for (int width : sortedWidths) {
cumulativeLineCount += lineCountForWidth(width);
if (cumulativeLineCount > lineCount/2)
return width;
}
throw new Error(“Cannot get here”);
}
private int lineCountForWidth(int width) {
return lineWidthHistogram.getLinesforWidth(width).size();
}
private Integer[] getSortedWidths() {
Set widths = lineWidthHistogram.getWidths();
Integer[] sortedWidths = (widths.toArray(new Integer[0]));
Arrays.sort(sortedWidths);
return sortedWidths;
}
}
```
================================================
FILE: docs/ch6.md
================================================
# 第 6 章 Objects and Data Structures 对象和数据结构

There is a reason that we keep our variables private. We don’t want anyone else to depend on them. We want to keep the freedom to change their type or implementation on a whim or an impulse. Why, then, do so many programmers automatically add getters and setters to their objects, exposing their private variables as if they were public?
> 将变量设置为私有(private)有一个理由:我们不想其他人依赖这些变量。我们还想在心血来潮时能自由修改其类型或实现。那么,为什么还是有那么多程序员给对象自动添加赋值器和取值器,将私有变量公之于众、如同它们根本就是公共变量一般呢?
## 6.1 DATA ABSTRACTION 数据抽象
Consider the difference between Listing 6-1 and Listing 6-2. Both represent the data of a point on the Cartesian plane. And yet one exposes its implementation and the other completely hides it.
> 看看代码清单 6-1 和代码清单 6-2 之间的区别。每段代码都表示笛卡儿平面上的一个点。不过,其中之一曝露了其实现,而另一个则完全隐藏了其实现。
Listing 6-1 Concrete Point
> 代码清单 6-1 具象点
```java
public class Point {
public double x;
public double y;
}
```
Listing 6-2 Abstract Point
> 代码清单 6-2 抽象点
```java
public interface Point {
double getX();
double getY();
void setCartesian(double x, double y);
double getR();
double getTheta();
void setPolar(double r, double theta);
}
```
The beautiful thing about Listing 6-2 is that there is no way you can tell whether the implementation is in rectangular or polar coordinates. It might be neither! And yet the interface still unmistakably represents a data structure.
> 代码清单 6-2 的漂亮之处在于,你不知道该实现会是在矩形坐标系中还是在极坐标系中。可能两个都不是!然而,该接口还是明白无误地呈现了一种数据结构。
But it represents more than just a data structure. The methods enforce an access policy. You can read the individual coordinates independently, but you must set the coordinates together as an atomic operation.
> 不过它呈现的还不止是一个数据结构。那些方法固定了一套存取策略。你可以单独读取某个坐标,但必须通过一次原子操作设定所有坐标。
Listing 6-1, on the other hand, is very clearly implemented in rectangular coordinates, and it forces us to manipulate those coordinates independently. This exposes implementation. Indeed, it would expose implementation even if the variables were private and we were using single variable getters and setters.
> 而代码清单 6-1 则非常清楚地是在矩形坐标系中实现,并要求我们单个操作那些坐标。这就曝露了实现。实际上,即便变量都是私有,而且我们也通过变量取值器和赋值器使用变量,其实现仍然曝露了。
Hiding implementation is not just a matter of putting a layer of functions between the variables. Hiding implementation is about abstractions! A class does not simply push its variables out through getters and setters. Rather it exposes abstract interfaces that allow its users to manipulate the essence of the data, without having to know its implementation.
> 隐藏实现并非只是在变量之间放上一个函数层那么简单。隐藏实现关乎抽象!类并不简单地用取值器和赋值器将其变量推向外间,而是曝露抽象接口,以便用户无需了解数据的实现就能操作数据本体。
Consider Listing 6-3 and Listing 6-4. The first uses concrete terms to communicate the fuel level of a vehicle, whereas the second does so with the abstraction of percentage. In the concrete case you can be pretty sure that these are just accessors of variables. In the abstract case you have no clue at all about the form of the data.
> 看看代码清单 6-3 和代码清单 6-4。前者使用具象手段与机动车的燃料层通信,而后者则采用百分比抽象。你能确定前者里面都是些变量存取器,而却无法得知后者中的数据形态。
Listing 6-3 Concrete Vehicle
> 代码清单 6-3 具象机动车
```java
public interface Vehicle {
double getFuelTankCapacityInGallons();
double getGallonsOfGasoline();
}
```
Listing 6-4 Abstract Vehicle
> 代码清单 6-4 抽象机动车
```java
public interface Vehicle {
double getPercentFuelRemaining();
}
```
In both of the above cases the second option is preferable. We do not want to expose the details of our data. Rather we want to express our data in abstract terms. This is not merely accomplished by using interfaces and/or getters and setters. Serious thought needs to be put into the best way to represent the data that an object contains. The worst option is to blithely add getters and setters.
> 以上两段代码以后者为佳。我们不愿曝露数据细节,更愿意以抽象形态表述数据。这并不只是用接口和/或赋值器、取值器就万事大吉。要以最好的方式呈现某个对象包含的数据,需要做严肃的思考。傻乐着乱加取值器和赋值器,是最坏的选择。
## 6.2 DATA/OBJECT ANTI-SYMMETRY 数据、对象的反对称性
These two examples show the difference between objects and data structures. Objects hide their data behind abstractions and expose functions that operate on that data. Data structure expose their data and have no meaningful functions. Go back and read that again. Notice the complimentary nature of the two definitions. They are virtual opposites. This difference may seem trivial, but it has far-reaching implications.
> 这两个例子展示了对象与数据结构之间的差异。对象把数据隐藏于抽象之后,曝露操作数据的函数。数据结构曝露其数据,没有提供有意义的函数。回过头再读一遍。留意这两种定义的本质。它们是对立的。这种差异貌似微小,但却有深远的含义。
Consider, for example, the procedural shape example in Listing 6-5. The Geometry class operates on the three shape classes. The shape classes are simple data structures without any behavior. All the behavior is in the Geometry class.
> 例如,代码清单 6-5 中的过程式代码形状范例。Geometry 类操作三个形状类。形状类都是简单的数据结构,没有任何行为。所有行为都在 Geometry 类中。
Listing 6-5 Procedural Shape
> 代码清单 6-5 过程式形状代码
```java
public class Square {
public Point topLeft;
public double side;
}
public class Rectangle {
public Point topLeft;
public double height;
public double width;
}
public class Circle {
public Point center;
public double radius;
}
public class Geometry {
public final double PI = 3.141592653589793;
public double area(Object shape) throws NoSuchShapeException
{
if (shape instanceof Square) {
Square s = (Square)shape;
return s.side * s.side;
}
else if (shape instanceof Rectangle) {
Rectangle r = (Rectangle)shape;
return r.height * r.width;
}
else if (shape instanceof Circle) {
Circle c = (Circle)shape;
return PI * c.radius * c.radius;
}
throw new NoSuchShapeException();
}
}
```
Object-oriented programmers might wrinkle their noses at this and complain that it is procedural—and they’d be right. But the sneer may not be warranted. Consider what would happen if a perimeter() function were added to Geometry. The shape classes would be unaffected! Any other classes that depended upon the shapes would also be unaffected! On the other hand, if I add a new shape, I must change all the functions in Geometry to deal with it. Again, read that over. Notice that the two conditions are diametrically opposed.
> 面向对象程序员可能会对此嗤之以鼻,抱怨说这是过程式代码——他们大概是对的,不过这种嘲笑并不完全正确。想想看,如果给 Geometry 类添加一个 primeter() 函数会怎样。那些形状类根本不会因此而受影响!另一方面,如果添加一个新形状,就得修改 Geometry 中的所有函数来处理它。再读一遍代码。注意,这两种情形也是直接对立的。
Now consider the object-oriented solution in Listing 6-6. Here the area() method is polymorphic. No Geometry class is necessary. So if I add a new shape, none of the existing functions are affected, but if I add a new function all of the shapes must be changed!1
> 现在来看看代码清单 6-6 中的面向对象方案。这里,area() 方法是多态的。不需要有 Geometry 类。所以,如果添加一个新形状,现有的函数一个也不会受到影响,而当添加新函数时所有的形状都得做修改[1]!
1. There are ways around this that are well known to experienced object-oriented designers: VISITOR, or dual-dispatch, for example. But these techniques carry costs of their own and generally return the structure to that of a procedural program.
Listing 6-6 Polymorphic Shapes
> 代码清单 6-6 多态式形状
```java
public class Square implements Shape {
private Point topLeft;
private double side;
public double area() {
return side*side;
}
}
public class Rectangle implements Shape {
private Point topLeft;
private double height;
private double width;
public double area() {
return height * width;
}
}
public class Circle implements Shape {
private Point center;
private double radius;
public final double PI = 3.141592653589793;
public double area() {
return PI * radius * radius;
}
}
```
Again, we see the complimentary nature of these two definitions; they are virtual opposites! This exposes the fundamental dichotomy between objects and data structures:
> 我们再次看到这两种定义的本质;它们是截然对立的。这说明了对象与数据结构之间的二分原理:
Procedural code (code using data structures) makes it easy to add new functions without changing the existing data structures. OO code, on the other hand, makes it easy to add new classes without changing existing functions.
> 过程式代码(使用数据结构的代码)便于在不改动既有数据结构的前提下添加新函数。面向对象代码便于在不改动既有函数的前提下添加新类。
The complement is also true:
> 反过来讲也说得通:
Procedural code makes it hard to add new data structures because all the functions must change. OO code makes it hard to add new functions because all the classes must change.
> 过程式代码难以添加新数据结构,因为必须修改所有函数。面向对象代码难以添加新函数,因为必须修改所有类。
So, the things that are hard for OO are easy for procedures, and the things that are hard for procedures are easy for OO!
> 所以,对于面向对象较难的事,对于过程式代码却较容易,反之亦然!
In any complex system there are going to be times when we want to add new data types rather than new functions. For these cases objects and OO are most appropriate. On the other hand, there will also be times when we’ll want to add new functions as opposed to data types. In that case procedural code and data structures will be more appropriate.
> 在任何一个复杂系统中,都会有需要添加新数据类型而不是新函数的时候。这时,对象和面向对象就比较适合。另一方面,也会有想要添加新函数而不是数据类型的时候。在这种情况下,过程式代码和数据结构更合适。
Mature programmers know that the idea that everything is an object is a myth. Sometimes you really do want simple data structures with procedures operating on them.
## 6.3 THE LAW OF DEMETER 得墨忒耳律
There is a well-known heuristic called the Law of Demeter2 that says a module should not know about the innards of the objects it manipulates. As we saw in the last section, objects hide their data and expose operations. This means that an object should not expose its internal structure through accessors because to do so is to expose, rather than to hide, its internal structure.
> 著名的得墨忒耳律(The Law of Demeter)[2]认为,模块不应了解它所操作对象的内部情形。如上节所见,对象隐藏数据,曝露操作。这意味着对象不应通过存取器曝露其内部结构,因为这样更像是曝露而非隐藏其内部结构。
2. http://en.wikipedia.org/wiki/Law_of_Demeter
More precisely, the Law of Demeter says that a method f of a class C should only call the methods of these:
> 更准确地说,得墨忒耳律认为,类 C 的方法 f 只应该调用以下对象的方法:
- C
- An object created by f
- An object passed as an argument to f
- An object held in an instance variable of C
---
> - C
> - 由 f 创建的对象;
> - 作为参数传递给 f 的对象;
> - 由 C 的实体变量持有的对象。
The method should not invoke methods on objects that are returned by any of the allowed functions. In other words, talk to friends, not to strangers.
> 方法不应调用由任何函数返回的对象的方法。换言之,只跟朋友谈话,不与陌生人谈话。
The following code3 appears to violate the Law of Demeter (among other things) because it calls the getScratchDir() function on the return value of getOptions() and then calls getAbsolutePath() on the return value of getScratchDir().
> 下列代码[3]违反了得墨忒耳律(除了违反其他规则之外),因为它调用了 getOptions( )返回值的 getScratchDir( )函数,又调用了 getScratchDir( )返回值的 getAbsolutePath( )方法。
3. Found somewhere in the apache framework.
```java
final String outputDir = ctxt.getOptions().getScratchDir().getAbsolutePath();
```
### 6.3.1 Train Wrecks 火车失事
This kind of code is often called a train wreck because it look like a bunch of coupled train cars. Chains of calls like this are generally considered to be sloppy style and should be avoided [G36]. It is usually best to split them up as follows:
> 这类代码常被称作火车失事,因为它看起来就像是一列火车。这类连串的调用通常被认为是肮脏的风格,应该避免[G36]。最好做类似如下的切分:
```java
Options opts = ctxt.getOptions();
File scratchDir = opts.getScratchDir();
final String outputDir = scratchDir.getAbsolutePath();
```
Are these two snippets of code violations of the Law of Demeter? Certainly the containing module knows that the ctxt object contains options, which contain a scratch directory, which has an absolute path. That’s a lot of knowledge for one function to know. The calling function knows how to navigate through a lot of different objects.
> 上列代码是否违反了得墨忒耳律呢?当然,模块知道 ctxt 对象包含有多个选项,每个选项中都有一个临时目录,而每个临时目录都有一个绝对路径。对于一个函数,这些知识真够丰富的。调用函数懂得如何在一大堆不同对象间浏览。

Whether this is a violation of Demeter depends on whether or not ctxt, Options, and ScratchDir are objects or data structures. If they are objects, then their internal structure should be hidden rather than exposed, and so knowledge of their innards is a clear violation of the Law of Demeter. On the other hand, if ctxt, Options, and ScratchDir are just data structures with no behavior, then they naturally expose their internal structure, and so Demeter does not apply.
> 这些代码是否违反得墨忒耳律,取决于 ctxt、Options 和 ScratchDir 是对象还是数据结构。如果是对象,则它们的内部结构应当隐藏而不曝露,而有关其内部细节的知识就明显违反了得墨忒耳律。如果 ctxt、Options 和 ScratchDir 只是数据结构,没有任何行为,则它们自然会曝露其内部结构,得墨忒耳律也就不适用了。
The use of accessor functions confuses the issue. If the code had been written as follows, then we probably wouldn’t be asking about Demeter violations.
> 属性访问器函数的使用把问题搞复杂了。如果像下面这样写代码,我们大概就不会提及对得墨忒耳律的违反。
```java
final String outputDir = ctxt.options.scratchDir.absolutePath;
```
This issue would be a lot less confusing if data structures simply had public variables and no functions, whereas objects had private variables and public functions. However, there are frameworks and standards (e.g., “beans”) that demand that even simple data structures have accessors and mutators.
> 如果数据结构只简单地拥有公共变量,没有函数,而对象则拥有私有变量和公共函数,这个问题就不那么混淆。然而,有些框架和标准甚至要求最简单的数据结构都要有访问器和改值器。
### 6.3.2 Hybrids 混杂
This confusion sometimes leads to unfortunate hybrid structures that are half object and half data structure. They have functions that do significant things, and they also have either public variables or public accessors and mutators that, for all intents and purposes, make the private variables public, tempting other external functions to use those variables the way a procedural program would use a data structure.4
> 这种混淆有时会不幸导致混合结构,一半是对象,一半是数据结构。这种结构拥有执行操作的函数,也有公共变量或公共访问器及改值器。无论出于怎样的初衷,公共访问器及改值器都把私有变量公开化,诱导外部函数以过程式程序使用数据结构的方式使用这些变量[4]。
4. This is sometimes called Feature Envy from [Refactoring].
Such hybrids make it hard to add new functions but also make it hard to add new data structures. They are the worst of both worlds. Avoid creating them. They are indicative of a muddled design whose authors are unsure of—or worse, ignorant of—whether they need protection from functions or types.
> 此类混杂增加了添加新函数的难度,也增加了添加新数据结构的难度,两面不讨好。应避免创造这种结构。它们的出现,展示了一种乱七八糟的设计,其作者不确定——或者更糟糕,完全无视——他们是否需要函数或类型的保护。
### 6.3.3 Hiding Structure 隐藏结构
What if ctxt, options, and scratchDir are objects with real behavior? Then, because objects are supposed to hide their internal structure, we should not be able to navigate through them. How then would we get the absolute path of the scratch directory?
> 假使 ctxt、Options 和 ScratchDir 是拥有真实行为的对象又怎样呢?由于对象应隐藏其内部结构,我们就不该能够看到内部结构。这样一来,如何才能取得临时目录的绝对路径呢?
```java
ctxt.getAbsolutePathOfScratchDirectoryOption();
```
or
> 或者
```java
ctx.getScratchDirectoryOption().getAbsolutePath()
```
The first option could lead to an explosion of methods in the ctxt object. The second presumes that getScratchDirectoryOption() returns a data structure, not an object. Neither option feels good.
> 第一种方案可能导致 ctxt 对象中方法的曝露。第二种方案是在假设 getScratchDirectoryOption()返回一个数据结构而非对象。两种方案感觉都不好。
If ctxt is an object, we should be telling it to do something; we should not be asking it about its internals. So why did we want the absolute path of the scratch directory? What were we going to do with it? Consider this code from (many lines farther down in) the same module:
> 如果 ctxt 是个对象,就应该要求它做点什么,不该要求它给出内部情形。那我们为何还要得到临时目录的绝对路径呢?我们要它做什么?来看看同一模块(许多行之后)的这段代码:
```java
String outFile = outputDir + “/” + className.replace('.', '/') + “.class”;
FileOutputStream fout = new FileOutputStream(outFile);
BufferedOutputStream bos = new BufferedOutputStream(fout);
```
The admixture of different levels of detail [G34][g6] is a bit troubling. Dots, slashes, file extensions, and File objects should not be so carelessly mixed together, and mixed with the enclosing code. Ignoring that, however, we see that the intent of getting the absolute path of the scratch directory was to create a scratch file of a given name.
> 这种不同层级细节的混杂([G34][g36])有点麻烦。句点、斜杠、文件扩展名和 File 对象不该如此随便地混杂到一起。不过,撇开这些毛病,我们发现,取得临时目录绝对路径的初衷是为了创建指定名称的临时文件。
So, what if we told the ctxt object to do this?
> 所以,直接让 ctxt 对象来做这事如何?
```java
BufferedOutputStream bos = ctxt.createScratchFileStream(classFileName);
```
That seems like a reasonable thing for an object to do! This allows ctxt to hide its internals and prevents the current function from having to violate the Law of Demeter by navigating through objects it shouldn’t know about.
> 这下看起来像是个对象做的事了!ctxt 隐藏了其内部结构,防止当前函数因浏览它不该知道的对象而违反得墨忒耳律。
## 6.4 DATA TRANSFER OBJECTS 数据传送对象
The quintessential form of a data structure is a class with public variables and no functions. This is sometimes called a data transfer object, or DTO. DTOs are very useful structures, especially when communicating with databases or parsing messages from sockets, and so on. They often become the first in a series of translation stages that convert raw data in a database into objects in the application code.
> 最为精练的数据结构,是一个只有公共变量、没有函数的类。这种数据结构有时被称为数据传送对象,或 DTO(Data Transfer Objects)。 DTO 是非常有用的结构,尤其是在与数据库通信、或解析套接字传递的消息之类场景中。在应用程序代码里一系列将原始数据转换为数据库的翻译过程中,它们往往是排头兵。
Somewhat more common is the “bean” form shown in Listing 6-7. Beans have private variables manipulated by getters and setters. The quasi-encapsulation of beans seems to make some OO purists feel better but usually provides no other benefit.
> 更常见的是如代码清单 6-7 所示的“豆”(bean)结构。豆结构拥有由赋值器和取值器操作的私有变量。对豆结构的半封装会让某些 OO 纯化论者感觉舒服些,不过通常没有其他好处。
Listing 6-7 address.java
> 代码清单 6-7 address.java
```java
public class Address {
private String street;
private String streetExtra;
private String city;
private String state;
private String zip;
public Address(String street, String streetExtra,
String city, String state, String zip) {
this.street = street;
this.streetExtra = streetExtra;
this.city = city;
this.state = state;
this.zip = zip;
}
public String getStreet() {
return street;
}
public String getStreetExtra() {
return streetExtra;
}
public String getCity() {
return city;
}
public String getState() {
return state;
}
public String getZip() {
return zip;
}
}
```
### Active Record
Active Records are special forms of DTOs. They are data structures with public (or bean-accessed) variables; but they typically have navigational methods like save and find. Typically these Active Records are direct translations from database tables, or other data sources.
> Active Record 是一种特殊的 DTO 形式。它们是拥有公共(或可豆式访问的)变量的数据结构,但通常也会拥有类似 save 和 find 这样的可浏览方法。Active Record 一般是对数据库表或其他数据源的直接翻译。
Unfortunately we often find that developers try to treat these data structures as though they were objects by putting business rule methods in them. This is awkward because it creates a hybrid between a data structure and an object.
> 我们不幸经常发现开发者往这类数据结构中塞进业务规则方法,把这类数据结构当成对象来用。这是不智的行为,因为它导致了数据结构和对象的混杂体。
The solution, of course, is to treat the Active Record as a data structure and to create separate objects that contain the business rules and that hide their internal data (which are probably just instances of the Active Record).
> 当然,解决方案就是把 Active Record 当做数据结构,并创建包含业务规则、隐藏内部数据(可能就是 Active Record 的实体)的独立对象。
## 6.5 CONCLUSION 小结
Objects expose behavior and hide data. This makes it easy to add new kinds of objects without changing existing behaviors. It also makes it hard to add new behaviors to existing objects. Data structures expose data and have no significant behavior. This makes it easy to add new behaviors to existing data structures but makes it hard to add new data structures to existing functions.
> 对象曝露行为,隐藏数据。便于添加新对象类型而无需修改既有行为,同时也难以在既有对象中添加新行为。数据结构曝露数据,没有明显的行为。便于向既有数据结构添加新行为,同时也难以向既有函数添加新数据结构。
In any given system we will sometimes want the flexibility to add new data types, and so we prefer objects for that part of the system. Other times we will want the flexibility to add new behaviors, and so in that part of the system we prefer data types and procedures. Good software developers understand these issues without prejudice and choose the approach that is best for the job at hand.
> 在任何系统中,我们有时会希望能够灵活地添加新数据类型,所以更喜欢在这部分使用对象。另外一些时候,我们希望能灵活地添加新行为,这时我们更喜欢使用数据类型和过程。优秀的软件开发者不带成见地了解这种情形,并依据手边工作的性质选择其中一种手段。
================================================
FILE: docs/ch7.md
================================================
# 第 7 章 Error Handling 错误处理
by Michael Feathers

It might seem odd to have a section about error handling in a book about clean code. Error handling is just one of those things that we all have to do when we program. Input can be abnormal and devices can fail. In short, things can go wrong, and when they do, we as programmers are responsible for making sure that our code does what it needs to do.
> 在一本有关整洁代码的书中,居然有讨论错误处理的章节,看起来有些突兀。错误处理只不过是编程时必须要做的事之一。输入可能出现异常,设备可能失效。简言之,可能会出错,当错误发生时,程序员就有责任确保代码照常工作。
The connection to clean code, however, should be clear. Many code bases are completely dominated by error handling. When I say dominated, I don’t mean that error handling is all that they do. I mean that it is nearly impossible to see what the code does because of all of the scattered error handling. Error handling is important, but if it obscures logic, it’s wrong.
> 然而,应该弄清楚错误处理与整洁代码的关系。许多程序完全由错误处理所占据。所谓占据,并不是说错误处理就是全部。我的意思是几乎无法看明白代码所做的事,因为到处都是凌乱的错误处理代码。错误处理很重要,但如果它搞乱了代码逻辑,就是错误的做法。
In this chapter I’ll outline a number of techniques and considerations that you can use to write code that is both clean and robust—code that handles errors with grace and style.
> 在本章中,我将概要列出编写既整洁又强固的代码——雅致地处理错误代码的一些技巧和思路。
## 7.1 USE EXCEPTIONS RATHER THAN RETURN CODES 使用异常而非返回码
Back in the distant past there were many languages that didn’t have exceptions. In those languages the techniques for handling and reporting errors were limited. You either set an error flag or returned an error code that the caller could check. The code in Listing 7-1 illustrates these approaches.
> 在很久以前,许多语言都不支持异常。这些语言处理和汇报错误的手段都有限。你要么设置一个错误标识,要么返回给调用者检查的错误码。代码清单 7-1 中的代码展示了这些手段。
Listing 7-1 DeviceController.java
> 代码清单 7-1 DeviceController.java
```java
public class DeviceController {
…
public void sendShutDown() {
DeviceHandle handle = getHandle(DEV1);
// Check the state of the device
if (handle != DeviceHandle.INVALID) {
// Save the device status to the record field
retrieveDeviceRecord(handle);
// If not suspended, shut down
if (record.getStatus() != DEVICE_SUSPENDED) {
pauseDevice(handle);
clearDeviceWorkQueue(handle);
closeDevice(handle);
} else {
logger.log("Device suspended. Unable to shut down");
}
} else {
logger.log("Invalid handle for: " + DEV1.toString());
}
}
…
}
```
The problem with these approaches is that they clutter the caller. The caller must check for errors immediately after the call. Unfortunately, it’s easy to forget. For this reason it is better to throw an exception when you encounter an error. The calling code is cleaner. Its logic is not obscured by error handling.
> 这类手段的问题在于,它们搞乱了调用者代码。调用者必须在调用之后即刻检查错误。不幸的是,这个步骤很容易被遗忘。所以,遇到错误时,最好抛出一个异常。调用代码很整洁,其逻辑不会被错误处理搞乱。
Listing 7-2 shows the code after we’ve chosen to throw exceptions in methods that can detect errors.
> 代码清单 7-2 展示了在方法中遇到错误时抛出异常的情形。
Listing 7-2 DeviceController.java (with exceptions)
> 代码清单 7-2 展示了在方法中遇到错误时抛出异常的情形。
```java
public class DeviceController {
…
public void sendShutDown() {
try {
tryToShutDown();
} catch (DeviceShutDownError e) {
logger.log(e);
}
}
private void tryToShutDown() throws DeviceShutDownError {
DeviceHandle handle = getHandle(DEV1);
DeviceRecord record = retrieveDeviceRecord(handle);
pauseDevice(handle);
clearDeviceWorkQueue(handle);
closeDevice(handle);
}
private DeviceHandle getHandle(DeviceID id) {
…
throw new DeviceShutDownError(“Invalid handle for: ” + id.toString());
…
}
…
}
```
Notice how much cleaner it is. This isn’t just a matter of aesthetics. The code is better because two concerns that were tangled, the algorithm for device shutdown and error handling, are now separated. You can look at each of those concerns and understand them independently.
> 注意这段代码整洁了很多。这不仅关乎美观。这段代码更好,因为之前纠结的两个元素设备关闭算法和错误处理现在被隔离了。你可以查看其中任一元素,分别理解它。
## 7.2 WRITE YOUR TRY-CATCH-FINALLY STATEMENT FIRST 先写 Try-Catch-Finally 语句
One of the most interesting things about exceptions is that they define a scope within your program. When you execute code in the try portion of a try-catch-finally statement, you are stating that execution can abort at any point and then resume at the catch.
> 异常的妙处之一是,它们在程序中定义了一个范围。执行 try-catchfinally 语句中 try 部分的代码时,你是在表明可随时取消执行,并在 catch 语句中接续。
In a way, try blocks are like transactions. Your catch has to leave your program in a consistent state, no matter what happens in the try. For this reason it is good practice to start with a try-catch-finally statement when you are writing code that could throw exceptions. This helps you define what the user of that code should expect, no matter what goes wrong with the code that is executed in the try.
> 在某种意义上,try 代码块就像是事务。catch 代码块将程序维持在一种持续状态,无论 try 代码块中发生了什么均如此。所以,在编写可能抛出异常的代码时,最好先写出 try-catch-finally 语句。这能帮你定义代码的用户应该期待什么,无论 try 代码块中执行的代码出什么错都一样。
Let’s look at an example. We need to write some code that accesses a file and reads some serialized objects.
> 来看个例子。我们要编写访问某个文件并读出一些序列化对象的代码。
We start with a unit test that shows that we’ll get an exception when the file doesn’t exist:
> 先写一个单元测试,其中显示当文件不存在时将得到一个异常:
```java
@Test(expected = StorageException.class)
public void retrieveSectionShouldThrowOnInvalidFileName() {
sectionStore.retrieveSection(“invalid - file”);
}
```
The test drives us to create this stub:
> 该测试驱动我们创建以下占位代码:
```java
public List retrieveSection(String sectionName) {
// dummy return until we have a real implementation
return new ArrayList();
}
```
Our test fails because it doesn’t throw an exception. Next, we change our implementation so that it attempts to access an invalid file. This operation throws an exception:
> 测试失败了,因为以上代码并未抛出异常。下一步,修改实现代码,尝试访问非法文件。该操作抛出一个异常:
```java
public List retrieveSection(String sectionName) {
try {
FileInputStream stream = new FileInputStream(sectionName)
} catch (Exception e) {
throw new StorageException(“retrieval error”, e);
}
return new ArrayList();
}
```
Our test passes now because we’ve caught the exception. At this point, we can refactor. We can narrow the type of the exception we catch to match the type that is actually thrown from the FileInputStream constructor: FileNotFoundException:
> 这次测试通过了,因为我们捕获了异常。此时,我们可以重构了。我们可以缩小异常类型的范围,使之符合 FileInputStream 构造器真正抛出的异常,即 FileNotFoundException:
```java
public List retrieveSection(String sectionName) {
try {
FileInputStream stream = new FileInputStream(sectionName);
stream.close();
} catch (FileNotFoundException e) {
throw new StorageException(“retrieval error”, e);
}
return new ArrayList();
}
```
Now that we’ve defined the scope with a try-catch structure, we can use TDD to build up the rest of the logic that we need. That logic will be added between the creation of the FileInputStream and the close, and can pretend that nothing goes wrong.
> 如此一来,我们就用 try-catch 结构定义了一个范围,可以继续用测试驱动(TDD)方法构建剩余的代码逻辑。这些代码逻辑将在 FileInputStream 和 close 之间添加,装作一切正常的样子。
Try to write tests that force exceptions, and then add behavior to your handler to satisfy your tests. This will cause you to build the transaction scope of the try block first and will help you maintain the transaction nature of that scope.
> 尝试编写强行抛出异常的测试,再往处理器中添加行为,使之满足测试要求。结果就是你要先构造 try 代码块的事务范围,而且也会帮助你维护好该范围的事务特征。
## 7.3 USE UNCHECKED EXCEPTIONS 使用不可控异常
The debate is over. For years Java programmers have debated over the benefits and liabilities of checked exceptions. When checked exceptions were introduced in the first version of Java, they seemed like a great idea. The signature of every method would list all of the exceptions that it could pass to its caller. Moreover, these exceptions were part of the type of the method. Your code literally wouldn’t compile if the signature didn’t match what your code could do.
> 辩论业已结束。多年来,Java 程序员们一直在争论可控异常(checked exception)的利与弊。Java 的第一个版本中引入可控异常时,看似一个极好的点子。每个方法的签名都列出它可能传递给调用者的异常。而且,这些异常就是方法类型的一部分。如果签名与代码实际所做之事不符,代码在字面上就无法编译。
At the time, we thought that checked exceptions were a great idea; and yes, they can yield some benefit. However, it is clear now that they aren’t necessary for the production of robust software. C# doesn’t have checked exceptions, and despite valiant attempts, C++ doesn’t either. Neither do Python or Ruby. Yet it is possible to write robust software in all of these languages. Because that is the case, we have to decide—really—whether checked exceptions are worth their price.
> 那时,我们认为可控异常是个绝妙的主意;而且,它也有所裨益。然而,现在已经很清楚,对于强固软件的生产,它并非必需。C#不支持可控异常。尽管做过勇敢的尝试,C++最后也不支持可控异常。Python 和 Ruby 同样如此。不过,用这些语言也有可能写出强固的软件。我们得决定——的确如此——可控异常是否值回票价。
What price? The price of checked exceptions is an Open/Closed Principle1 violation. If you throw a checked exception from a method in your code and the catch is three levels above, you must declare that exception in the signature of each method between you and the catch. This means that a change at a low level of the software can force signature changes on many higher levels. The changed modules must be rebuilt and redeployed, even though nothing they care about changed.
> 代价是什么?可控异常的代价就是违反开放/闭合原则[1]。如果你在方法中抛出可控异常,而 catch 语句在三个层级之上,你就得在 catch 语句和抛出异常处之间的每个方法签名中声明该异常。这意味着对软件中较低层级的修改,都将波及较高层级的签名。修改好的模块必须重新构建、发布,即便它们自身所关注的任何东西都没改动过。
1. [Martin].
Consider the calling hierarchy of a large system. Functions at the top call functions below them, which call more functions below them, ad infinitum. Now let’s say one of the lowest level functions is modified in such a way that it must throw an exception. If that exception is checked, then the function signature must add a throws clause. But this means that every function that calls our modified function must also be modified either to catch the new exception or to append the appropriate throws clause to its signature. Ad infinitum. The net result is a cascade of changes that work their way from the lowest levels of the software to the highest! Encapsulation is broken because all functions in the path of a throw must know about details of that low-level exception. Given that the purpose of exceptions is to allow you to handle errors at a distance, it is a shame that checked exceptions break encapsulation in this way.
> 以某个大型系统的调用层级为例。顶端函数调用它们之下的函数,逐级向下。假设某个位于最底层级的函数被修改为抛出一个异常。如果该异常是可控的,则函数签名就要添加 throw 子句。这意味着每个调用该函数的函数都要修改,捕获新异常,或在其签名中添加合适的 throw 子句。以此类推。最终得到的就是一个从软件最底端贯穿到最高端的修改链!封装被打破了,因为在抛出路径中的每个函数都要去了解下一层级的异常细节。既然异常旨在让你能在较远处处理错误,可控异常以这种方式破坏封装简直就是一种耻辱。
Checked exceptions can sometimes be useful if you are writing a critical library: You must catch them. But in general application development the dependency costs outweigh the benefits.
> 如果你在编写一套关键代码库,则可控异常有时也会有用:你必须捕获异常。但对于一般的应用开发,其依赖成本要高于收益。
## 7.4 PROVIDE CONTEXT WITH EXCEPTIONS 给出异常发生的环境说明
Each exception that you throw should provide enough context to determine the source and location of an error. In Java, you can get a stack trace from any exception; however, a stack trace can’t tell you the intent of the operation that failed.
> 你抛出的每个异常,都应当提供足够的环境说明,以便判断错误的来源和处所。在 Java 中,你可以从任何异常里得到堆栈踪迹(stack trace);然而,堆栈踪迹却无法告诉你该失败操作的初衷。
Create informative error messages and pass them along with your exceptions. Mention the operation that failed and the type of failure. If you are logging in your application, pass along enough information to be able to log the error in your catch.
> 应创建信息充分的错误消息,并和异常一起传递出去。在消息中,包括失败的操作和失败类型。如果你的应用程序有日志系统,传递足够的信息给 catch 块,并记录下来。
## 7.5 DEFINE EXCEPTION CLASSES IN TERMS OF A CALLER’S NEEDS 依调用者需要定义异常类
There are many ways to classify errors. We can classify them by their source: Did they come from one component or another? Or their type: Are they device failures, network failures, or programming errors? However, when we define exception classes in an application, our most important concern should be how they are caught.
> 对错误分类有很多方式。可以依其来源分类:是来自组件还是其他地方?或依其类型分类:是设备错误、网络错误还是编程错误?不过,当我们在应用程序中定义异常类时,最重要的考虑应该是它们如何被捕获。
Let’s look at an example of poor exception classification. Here is a try-catch-finally statement for a third-party library call. It covers all of the exceptions that the calls can throw:
> 来看一个不太好的异常分类例子。下面的 try-catch-finally 语句是对某个第三方代码库的调用。它覆盖了该调用可能抛出的所有异常:
```java
ACMEPort port = new ACMEPort(12);
try {
port.open();
} catch (DeviceResponseException e) {
reportPortError(e);
logger.log(“Device response exception”, e);
} catch (ATM1212UnlockedException e) {
reportPortError(e);
logger.log(“Unlock exception”, e);
} catch (GMXError e) {
reportPortError(e);
logger.log(“Device response exception”);
} finally {
…
}
```
That statement contains a lot of duplication, and we shouldn’t be surprised. In most exception handling situations, the work that we do is relatively standard regardless of the actual cause. We have to record an error and make sure that we can proceed.
> 语句包含了一大堆重复代码,这并不出奇。在大多数异常处理中,不管真实原因如何,我们总是做相对标准的处理。我们得记录错误,确保能继续工作。
In this case, because we know that the work that we are doing is roughly the same regardless of the exception, we can simplify our code considerably by wrapping the API that we are calling and making sure that it returns a common exception type:
> 在本例中,既然知道我们所做的事不外如此,就可以通过打包调用 API、确保它返回通用异常类型,从而简化代码。
```java
LocalPort port = new LocalPort(12);
try {
port.open();
} catch (PortDeviceFailure e) {
reportError(e);
logger.log(e.getMessage(), e);
} finally {
…
}
```
Our LocalPort class is just a simple wrapper that catches and translates exceptions thrown by the ACMEPort class:
> 在本例中,既然知道我们所做的事不外如此,就可以通过打包调用 API、确保它返回通用异常类型,从而简化代码。
```java
public class LocalPort {
private ACMEPort innerPort;
public LocalPort(int portNumber) {
innerPort = new ACMEPort(portNumber);
}
public void open() {
try {
innerPort.open();
} catch (DeviceResponseException e) {
throw new PortDeviceFailure(e);
} catch (ATM1212UnlockedException e) {
throw new PortDeviceFailure(e);
} catch (GMXError e) {
throw new PortDeviceFailure(e);
}
}
…
}
```
Wrappers like the one we defined for ACMEPort can be very useful. In fact, wrapping third-party APIs is a best practice. When you wrap a third-party API, you minimize your dependencies upon it: You can choose to move to a different library in the future without much penalty. Wrapping also makes it easier to mock out third-party calls when you are testing your own code.
> 类似我们为 ACMEPort 定义的这种打包类非常有用。实际上,将第三方 API 打包是个良好的实践手段。当你打包一个第三方 API,你就降低了对它的依赖:未来你可以不太痛苦地改用其他代码库。在你测试自己的代码时,打包也有助于模拟第三方调用。
One final advantage of wrapping is that you aren’t tied to a particular vendor’s API design choices. You can define an API that you feel comfortable with. In the preceding example, we defined a single exception type for port device failure and found that we could write much cleaner code.
> 打包的好处还在于你不必绑死在某个特定厂商的 API 设计上。你可以定义自己感觉舒服的 API。在上例中,我们为 port 设备错误定义了一个异常类型,然后发现这样能写出更整洁的代码。
Often a single exception class is fine for a particular area of code. The information sent with the exception can distinguish the errors. Use different classes only if there are times when you want to catch one exception and allow the other one to pass through.
> 对于代码的某个特定区域,单一异常类通常可行。伴随异常发送出来的信息能够区分不同错误。如果你想要捕获某个异常,并且放过其他异常,就使用不同的异常类。
## 7.6 DEFINE THE NORMAL FLOW 定义常规流程
If you follow the advice in the preceding sections, you’ll end up with a good amount of separation between your business logic and your error handling. The bulk of your code will start to look like a clean unadorned algorithm. However, the process of doing this pushes error detection to the edges of your program. You wrap external APIs so that you can throw your own exceptions, and you define a handler above your code so that you can deal with any aborted computation. Most of the time this is a great approach, but there are some times when you may not want to abort.
> 如果你遵循前文提及的建议,在业务逻辑和错误处理代码之间就会有良好的区隔。大量代码会开始变得像是整洁而简朴的算法。然而,这样做却把错误检测推到了程序的边缘地带。你打包了外部 API 以抛出自己的异常,你在代码的顶端定义了一个处理器来应付任何失败了的运算。在大多数时候,这种手段很棒,不过有时你也许不愿这么做。

Let’s take a look at an example. Here is some awkward code that sums expenses in a billing application:
> 来看一个例子。下面的笨代码来自某个记账应用的开支总计模块:
```java
try {
MealExpenses expenses = expenseReportDAO.getMeals(employee.getID());
m_total += expenses.getTotal();
} catch(MealExpensesNotFound e) {
m_total += getMealPerDiem();
}
```
In this business, if meals are expensed, they become part of the total. If they aren’t, the employee gets a meal per diem amount for that day. The exception clutters the logic. Wouldn’t it be better if we didn’t have to deal with the special case? If we didn’t, our code would look much simpler. It would look like this:
> 业务逻辑是,如果消耗了餐食,则计入总额中。如果没有消耗,则员工得到当日餐食补贴。异常打断了业务逻辑。如果不去处理特殊情况会不会好一些?那样的话代码看起来会更简洁。就像这样:
```java
MealExpenses expenses = expenseReportDAO.getMeals(employee.getID());
m_total += expenses.getTotal();
```
Can we make the code that simple? It turns out that we can. We can change the ExpenseReportDAO so that it always returns a MealExpense object. If there are no meal expenses, it returns a MealExpense object that returns the per diem as its total:
> 其总是返回 MealExpense 对象。如果没有餐食消耗,就返回一个返回餐食补贴的 MealExpense 对象。
```java
public class PerDiemMealExpenses implements MealExpenses {
public int getTotal() {
// return the per diem default
}
}
```
This is called the SPECIAL CASE PATTERN [Fowler]. You create a class or configure an object so that it handles a special case for you. When you do, the client code doesn’t have to deal with exceptional behavior. That behavior is encapsulated in the special case object.
> 这种手法叫做特例模式(SPECIAL CASE PATTERN,[Fowler])。创建一个类或配置一个对象,用来处理特例。你来处理特例,客户代码就不用应付异常行为了。异常行为被封装到特例对象中。
## 7.7 DON’T RETURN NULL 别返回 null 值
I think that any discussion about error handling should include mention of the things we do that invite errors. The first on the list is returning null. I can’t begin to count the number of applications I’ve seen in which nearly every other line was a check for null. Here is some example code:
> 我认为,要讨论错误处理,就一定要提及那些容易引发错误的做法。第一项就是返回 null 值。我不想去计算曾经见过多少几乎每行代码都在检查 null 值的应用程序。下面就是个例子:
```java
public void registerItem(Item item) {
if (item != null) {
ItemRegistry registry = peristentStore.getItemRegistry();
if (registry != null) {
Item existing = registry.getItem(item.getID());
if (existing.getBillingPeriod().hasRetailOwner()) {
existing.register(item);
}
}
}
}
```
If you work in a code base with code like this, it might not look all that bad to you, but it is bad! When we return null, we are essentially creating work for ourselves and foisting problems upon our callers. All it takes is one missing null check to send an application spinning out of control.
> 这种代码看似不坏,其实糟透了!返回 null 值,基本上是在给自己增加工作量,也是在给调用者添乱。只要有一处没检查 null 值,应用程序就会失控。
Did you notice the fact that there wasn’t a null check in the second line of that nested if statement? What would have happened at runtime if persistentStore were null? We would have had a NullPointerException at runtime, and either someone is catching NullPointerException at the top level or they are not. Either way it’s bad. What exactly should you do in response to a NullPointerException thrown from the depths of your application?
> 你有没有注意到,嵌套 if 语句的第二行没有检查 null 值?如果在运行时 persistentStore 为 null 会发生什么事?我们会在运行时得到一个 NullPointerException 异常,也许有人在代码顶端捕获这个异常,也可能没有捕获。两种情况都很糟糕。对于从应用程序深处抛出的 NullPointerException 异常,你到底该作何反应呢?
It’s easy to say that the problem with the code above is that it is missing a null check, but in actuality, the problem is that it has too many. If you are tempted to return null from a method, consider throwing an exception or returning a SPECIAL CASE object instead. If you are calling a null-returning method from a third-party API, consider wrapping that method with a method that either throws an exception or returns a special case object.
> 可以敷衍说上列代码的问题是少做了一次 null 值检查,其实问题多多。如果你打算在方法中返回 null 值,不如抛出异常,或是返回特例对象。如果你在调用某个第三方 API 中可能返回 null 值的方法,可以考虑用新方法打包这个方法,在新方法中抛出异常或返回特例对象。
In many cases, special case objects are an easy remedy. Imagine that you have code like this:
> 在许多情况下,特例对象都是爽口良药。设想有这么一段代码:
```java
List employees = getEmployees();
if (employees != null) {
for(Employee e : employees) {
totalPay += e.getPay();
}
}
```
Right now, getEmployees can return null, but does it have to? If we change getEmployee so that it returns an empty list, we can clean up the code:
> 现在,getExployees 可能返回 null,但是否一定要这么做呢?如果修改 getEmployee,返回空列表,就能使代码整洁起来:
```java
List employees = getEmployees();
for(Employee e : employees) {
totalPay += e.getPay();
}
```
Fortunately, Java has Collections.emptyList(), and it returns a predefined immutable list that we can use for this purpose:
> 所幸 Java 有 Collections.emptyList( )方法,该方法返回一个预定义不可变列表,可用于这种目的:
```java
public List getEmployees() {
if( .. there are no employees .. )
return Collections.emptyList();
}
```
If you code this way, you will minimize the chance of NullPointerExceptions and your code will be cleaner.
> 这样编码,就能尽量避免 NullPointerException 的出现,代码也就更整洁了。
## 7.8 DON’T PASS NULL 别传递 null 值
Returning null from methods is bad, but passing null into methods is worse. Unless you are working with an API which expects you to pass null, you should avoid passing null in your code whenever possible.
> 在方法中返回 null 值是糟糕的做法,但将 null 值传递给其他方法就更糟糕了。除非 API 要求你向它传递 null 值,否则就要尽可能避免传递 null 值。
Let’s look at an example to see why. Here is a simple method which calculates a metric for two points:
> 举例说明原因。用下面这个简单的方法计算两点的投射:
```java
public class MetricsCalculator
{
public double xProjection(Point p1, Point p2) {
return (p2.x – p1.x) * 1.5;
}
…
}
```
What happens when someone passes null as an argument?
> 如果有人传入 null 值会怎样?
```java
calculator.xProjection(null, new Point(12, 13));
```
We’ll get a NullPointerException, of course.
> 当然,我们会得到一个 NullPointerException 异常。
How can we fix it? We could create a new exception type and throw it:
> 如何修正?可以创建一个新异常类型并抛出:
```java
public class MetricsCalculator
{
public double xProjection(Point p1, Point p2) {
if (p1 == null || p2 == null) {
throw InvalidArgumentException(
“Invalid argument for MetricsCalculator.xProjection”);
}
return (p2.x – p1.x) * 1.5;
}
}
```
Is this better? It might be a little better than a null pointer exception, but remember, we have to define a handler for InvalidArgumentException. What should the handler do? Is there any good course of action?
> 这样做好些吗?可能比 null 指针异常好一些,但要记住,我们还得为 InvalidArgumentException 异常定义处理器。这个处理器该做什么?还有更好的做法吗?
There is another alternative. We could use a set of assertions:
> 还有替代方案。可以使用一组断言
```java
public class MetricsCalculator
{
public double xProjection(Point p1, Point p2) {
assert p1 != null : “p1 should not be null”;
assert p2 != null : “p2 should not be null”;
return (p2.x – p1.x) * 1.5;
}
}
```
It’s good documentation, but it doesn’t solve the problem. If someone passes null, we’ll still have a runtime error.
> 看上去很美,但仍未解决问题。如果有人传入 null 值,还是会得到运行时错误。
In most programming languages there is no good way to deal with a null that is passed by a caller accidentally. Because this is the case, the rational approach is to forbid passing null by default. When you do, you can code with the knowledge that a null in an argument list is an indication of a problem, and end up with far fewer careless mistakes.
> 在大多数编程语言中,没有良好的方法能对付由调用者意外传入的 null 值。事已如此,恰当的做法就是禁止传入 null 值。这样,你在编码的时候,就会时时记住参数列表中的 null 值意味着出问题了,从而大量避免这种无心之失。
## 7.9 CONCLUSION 小结
Clean code is readable, but it must also be robust. These are not conflicting goals. We can write robust clean code if we see error handling as a separate concern, something that is viewable independently of our main logic. To the degree that we are able to do that, we can reason about it independently, and we can make great strides in the maintainability of our code.
> 整洁代码是可读的,但也要强固。可读与强固并不冲突。如果将错误处理隔离看待,独立于主要逻辑之外,就能写出强固而整洁的代码。做到这一步,我们就能单独处理它,也极大地提升了代码的可维护性。
================================================
FILE: docs/ch8.md
================================================
# 第 8 章 Boundaries 边界
by James Grenning

We seldom control all the software in our systems. Sometimes we buy third-party packages or use open source. Other times we depend on teams in our own company to produce components or subsystems for us. Somehow we must cleanly integrate this foreign code with our own. In this chapter we look at practices and techniques to keep the boundaries of our software clean.
> 我们很少控制系统中的全部软件。有时我们购买第三方程序包或使用开放源代码,有时我们依靠公司中其他团队打造组件或子系统。不管是哪种情况,我们都得将外来代码干净利落地整合进自己的代码中。本章将介绍一些保持软件边界整洁的实践手段和技巧。
## 8.1 USING THIRD-PARTY CODE 使用第三方代码
There is a natural tension between the provider of an interface and the user of an interface. Providers of third-party packages and frameworks strive for broad applicability so they can work in many environments and appeal to a wide audience. Users, on the other hand, want an interface that is focused on their particular needs. This tension can cause problems at the boundaries of our systems.
> 在接口提供者和使用者之间,存在与生俱来的张力。第三方程序包和框架提供者追求普适性,这样就能在多个环境中工作,吸引广泛的用户。而使用者则想要集中满足特定需求的接口。这种张力会导致系统边界上出现问题。
Let’s look at java.util.Map as an example. As you can see by examining Figure 8-1, Maps have a very broad interface with plenty of capabilities. Certainly this power and flexibility is useful, but it can also be a liability. For instance, our application might build up a Map and pass it around. Our intention might be that none of the recipients of our Map delete anything in the map. But right there at the top of the list is the clear() method. Any user of the Map has the power to clear it. Or maybe our design convention is that only particular types of objects can be stored in the Map, but Maps do not reliably constrain the types of objects placed within them. Any determined user can add items of any type to any Map.
> 以 java.util.Map 为例。如你在表 8-1 中所见,Map 有着广阔的接口和丰富的功能。当然,这种力量和灵活性很有用,但也要付出代价。比如,应用程序可能构造一个 Map 对象并传递它。我们的初衷可能是 Map 对象的所有接收者都不要删除映射图中的任何东西。但表 8-1 的顶端却正好有一个 clear( )方法。Map 的任何使用者都能清除映射图。或许设计惯例是 Map 中只能保存特定的类型,但 Map 并不会可靠地约束存于其中的对象的类型。使用者可随意往 Map 中塞入任何类型的条目。
Figure 8-1 The methods of Map

If our application needs a Map of Sensors, you might find the sensors set up like this:
> 如果你的应用程序需要一个包容 Sensor 类对象的 Map 映射图,大概会是这样:
```java
Map sensors = new HashMap();
```
Then, when some other part of the code needs to access the sensor, you see this code:
> 当代码的其他部分需要访问这些 sensor,就会有这行代码:
```java
Sensor s = (Sensor)sensors.get(sensorId );
```
We don’t just see it once, but over and over again throughout the code. The client of this code carries the responsibility of getting an Object from the Map and casting it to the right type. This works, but it’s not clean code. Also, this code does not tell its story as well as it could. The readability of this code can be greatly improved by using generics, as shown below:
> 这行代码一再出现。代码的调用端承担了从 Map 中取得对象并将其转换为正确类型的职责。行倒是行,却并非整洁的代码。而且,这行代码并未说明自己的用途。通过对泛型的使用,这段代码可读性可以大大提高,如下所示:
```java
Map sensors = new HashMap();
…
Sensor s = sensors.get(sensorId );
```
However, this doesn’t solve the problem that `Map` provides more capability than we need or want.
> 不过,`Map` 提供了超出所需/所愿的功能的问题,仍未得到解决。
Passing an instance of `Map` liberally around the system means that there will be a lot of places to fix if the interface to Map ever changes. You might think such a change to be unlikely, but remember that it changed when generics support was added in Java 5. Indeed, we’ve seen systems that are inhibited from using generics because of the sheer magnitude of changes needed to make up for the liberal use of Maps.
> 在系统中不受限制地传递`Map`的实体,意味着当到 Map 的接口被修改时,有许多地方都要跟着改。你或许会认为这样的改动不太可能发生,不过,当 Java 5 加入对泛型的支持时,的确发生了改动。我们也的确见到一些系统因为要做大量改动才能自由使用 Map 类,而无法使用泛型。
A cleaner way to use Map might look like the following. No user of Sensors would care one bit if generics were used or not. That choice has become (and always should be) an implementation detail.
> 使用 Map 的更整洁的方式大致如下。Sensors 的用户不必关心是否用了泛型,那将是(也该是)实现细节才关心的。
```java
public class Sensors {
private Map sensors = new HashMap();
public Sensor getById(String id) {
return (Sensor) sensors.get(id);
}
//snip
}
```
The interface at the boundary (Map) is hidden. It is able to evolve with very little impact on the rest of the application. The use of generics is no longer a big issue because the casting and type management is handled inside the Sensors class.
> 边界上的接口(Map)是隐藏的。它能随来自应用程序其他部分的极小的影响而变动。对泛型的使用不再是个大问题,因为转换和类型管理是在 Sensors 类内部处理的。
This interface is also tailored and constrained to meet the needs of the application. It results in code that is easier to understand and harder to misuse. The Sensors class can enforce design and business rules.
> 该接口也经过仔细修整和归置以适应应用程序的需要。结果就是得到易于理解、难以被误用的代码。Sensors 类推动了设计和业务的规则。我们并不建议总是以这种方式封装 Map 的使用。
We are not suggesting that every use of Map be encapsulated in this form. Rather, we are advising you not to pass Maps (or any other interface at a boundary) around your system. If you use a boundary interface like Map, keep it inside the class, or close family of classes, where it is used. Avoid returning it from, or accepting it as an argument to, public APIs.
> 我们建议不要将 Map(或在边界上的其他接口)在系统中传递。如果你使用类似 Map 这样的边界接口,就把它保留在类或近亲类中。避免从公共 API 中返回边界接口,或将边界接口作为参数传递给公共 API。
## 8.2 EXPLORING AND LEARNING BOUNDARIES 浏览和学习边界
Third-party code helps us get more functionality delivered in less time. Where do we start when we want to utilize some third-party package? It’s not our job to test the third-party code, but it may be in our best interest to write tests for the third-party code we use.
> 第三方代码帮助我们在更少时间内发布更丰富的功能。在利用第三方程序包时,该从何处入手呢?我们没有测试第三方代码的职责,但为要使用的第三方代码编写测试,可能最符合我们的利益。
Suppose it is not clear how to use our third-party library. We might spend a day or two (or more) reading the documentation and deciding how we are going to use it. Then we might write our code to use the third-party code and see whether it does what we think. We would not be surprised to find ourselves bogged down in long debugging sessions trying to figure out whether the bugs we are experiencing are in our code or theirs.
> 设想第三方代码库的使用方法并不清楚。我们可能会花上一两天(或者更多)时间阅读文档,决定如何使用。然后,我们会编写使用第三方代码的代码,看看是否如我们所愿地工作。陷入长时间的调试、找出在我们或他们代码中的缺陷,这可不是什么稀罕事。
Learning the third-party code is hard. Integrating the third-party code is hard too. Doing both at the same time is doubly hard. What if we took a different approach? Instead of experimenting and trying out the new stuff in our production code, we could write some tests to explore our understanding of the third-party code. Jim Newkirk calls such tests learning tests.1
> 学习第三方代码很难。整合第三方代码也很难。同时做这两件事难上加难。如果我们采用不同的做法呢?不要在生产代码中试验新东西,而是编写测试来遍览和理解第三方代码。Jim Newkirk 把这叫做学习性测试(learning tests)1。
1. [BeckTDD], pp. 136–137.
In learning tests we call the third-party API, as we expect to use it in our application. We’re essentially doing controlled experiments that check our understanding of that API. The tests focus on what we want out of the API.
> 在学习性测试中,我们如在应用中那样调用第三方代码。我们基本上是在通过核对试验来检测自己对那个 API 的理解程度。测试聚焦于我们想从 API 得到的东西。
## 8.3 LEARNING LOG4J 学习 log4j
Let’s say we want to use the apache log4j package rather than our own custom-built logger. We download it and open the introductory documentation page. Without too much reading we write our first test case, expecting it to write “hello” to the console.
> 比如,我们想使用 apache log4j 包来代替自定义的日志代码。我们下载了 log4j,打开介绍文档页。无需看太久,就编写了第一个测试用例,希望它能向控制台输出 hello 字样。
```java
@Test
public void testLogCreate() {
Logger logger = Logger.getLogger(“MyLogger”);
logger.info(“hello”);
}
```
When we run it, the logger produces an error that tells us we need something called an Appender. After a little more reading we find that there is a ConsoleAppender. So we create a ConsoleAppender and see whether we have unlocked the secrets of logging to the console.
> 运行,logger 发生了一个错误,告诉我们需要用 Appender。再多读一点文档,我们发现有个 ConsoleAppender。于是我们创建了一个 ConsoleAppender,再看是否能解开向控制台输出日志的秘诀。
```java
@Test
public void testLogAddAppender() {
Logger logger = Logger.getLogger(“MyLogger”);
ConsoleAppender appender = new ConsoleAppender();
logger.addAppender(appender);
logger.info(“hello”);
}
```
This time we find that the Appender has no output stream. Odd—it seems logical that it’d have one. After a little help from Google, we try the following:
> 比如,我们想使用 apache log4j 包来代替自定义的日志代码。我们下载了 log4j,打开介绍文档页。无需看太久,就编写了第一个测试用例,希望它能向控制台输出 hello 字样。
```java
@Test
public void testLogAddAppender() {
Logger logger = Logger.getLogger(“MyLogger”);
logger.removeAllAppenders();
logger.addAppender(new ConsoleAppender(
new PatternLayout(“%p %t %m%n”),
ConsoleAppender.SYSTEM_OUT));
logger.info(“hello”);
}
```
That worked; a log message that includes “hello” came out on the console! It seems odd that we have to tell the ConsoleAppender that it writes to the console.
> 这回行了;hello 字样的日志信息出现在控制台上!必须告知 ConsoleAppender,让它往控制台写字,看起来有点奇怪。
Interestingly enough, when we remove the ConsoleAppender.SystemOut argument, we see that “hello” is still printed. But when we take out the PatternLayout, it once again complains about the lack of an output stream. This is very strange behavior.
> 很有趣,当我们移除 ConsoleAppender.SystemOut 参数时,那个 hello 字样仍然输出到屏幕上。但如果取走 PatternLayout,就会出现关于没有输出流的错误信息。这实在太古怪了。
Looking a little more carefully at the documentation, we see that the default ConsoleAppender constructor is “unconfigured,” which does not seem too obvious or useful. This feels like a bug, or at least an inconsistency, in log4j.
> 再仔细看看文档,我们看到默认的 ConsoleAppender 构造器是“未配置”的,这看起来并不明显或没什么用,反而像是 log4j 的一个缺陷,或者至少是前后不太一致。
A bit more googling, reading, and testing, and we eventually wind up with Listing 8-1. We’ve discovered a great deal about the way that log4j works, and we’ve encoded that knowledge into a set of simple unit tests.
> 再搜索、阅读、测试,最终我们得到代码清单 8-1。我们极大地发掘了 log4j 的工作方式,也将得到的知识融入了一系列简单的单元测试中。
Listing 8-1 LogTest.java
> 代码清单 8-1 LogTest.java
```java
public class LogTest {
private Logger logger;
@Before
public void initialize() {
logger = Logger.getLogger(“logger”);
logger.removeAllAppenders();
Logger.getRootLogger().removeAllAppenders();
}
@Test
public void basicLogger() {
BasicConfigurator.configure();
logger.info(“basicLogger”);
}
@Test
public void addAppenderWithStream() {
logger.addAppender(new ConsoleAppender(
new PatternLayout(“%p %t %m%n”),
ConsoleAppender.SYSTEM_OUT));
logger.info(“addAppenderWithStream”);
}
@Test
public void addAppenderWithoutStream() {
logger.addAppender(new ConsoleAppender(
new PatternLayout(“%p %t %m%n”)));
logger.info(“addAppenderWithoutStream”);
}
}
```
Now we know how to get a simple console logger initialized, and we can encapsulate that knowledge into our own logger class so that the rest of our application is isolated from the log4j boundary interface.
> 现在我们知道如何初始化一个简单的控制台日志器,也能把这些知识封装到自己的日志类中,好将应用程序的其他部分与 log4j 的边界接口隔离开来。
## 8.4 LEARNING TESTS ARE BETTER THAN FREE 学习性测试的好处不只是免费
The learning tests end up costing nothing. We had to learn the API anyway, and writing those tests was an easy and isolated way to get that knowledge. The learning tests were precise experiments that helped increase our understanding.
> 学习性测试毫无成本。无论如何我们都得学习要使用的 API,而编写测试则是获得这些知识的容易而不会影响其他工作的途径。学习性测试是一种精确试验,帮助我们增进对 API 的理解。
Not only are learning tests free, they have a positive return on investment. When there are new releases of the third-party package, we run the learning tests to see whether there are behavioral differences.
> 学习性测试不光免费,还在投资上有正面的回报。当第三方程序包发布了新版本,我们可以运行学习性测试,看看程序包的行为有没有改变。
Learning tests verify that the third-party packages we are using work the way we expect them to. Once integrated, there are no guarantees that the third-party code will stay compatible with our needs. The original authors will have pressures to change their code to meet new needs of their own. They will fix bugs and add new capabilities. With each release comes new risk. If the third-party package changes in some way incompatible with our tests, we will find out right away.
> 学习性测试确保第三方程序包按照我们想要的方式工作。一旦整合进来,就不能保证第三方代码总与我们的需要兼容。原作者不得不修改代码来满足他们自己的新需要。他们会修正缺陷、添加新功能。风险伴随新版本而来。如果第三方程序包的修改与测试不兼容,我们也能马上发现。
Whether you need the learning provided by the learning tests or not, a clean boundary should be supported by a set of outbound tests that exercise the interface the same way the production code does. Without these boundary tests to ease the migration, we might be tempted to stay with the old version longer than we should.
> 无论你是否需要通过学习性测试来学习,总要有一系列与生产代码中调用方式一致的输出测试来支持整洁的边界。不使用这些边界测试来减轻迁移的劳力,我们可能会超出应有时限,长久地绑在旧版本上面。
## 8.5 USING CODE THAT DOES NOT YET EXIST 使用尚不存在的代码
There is another kind of boundary, one that separates the known from the unknown. There are often places in the code where our knowledge seems to drop off the edge. Sometimes what is on the other side of the boundary is unknowable (at least right now). Sometimes we choose to look no farther than the boundary.
> 还有另一种边界,那种将已知和未知分隔开的边界。在代码中总有许多地方是我们的知识未及之处。有时,边界那边就是未知的(至少目前未知)。有时,我们并不往边界那边看过去。
A number of years back I was part of a team developing software for a radio communications system. There was a subsystem, the “Transmitter,” that we knew little about, and the people responsible for the subsystem had not gotten to the point of defining their interface. We did not want to be blocked, so we started our work far away from the unknown part of the code.
> 好多年以前,我曾在一个开发无线通信系统软件的团队中工作。该系统有个子系统 Transmitter(发送机)。我们对 Transmitter 知之甚少,而该子系统的开发者还没有对接口进行定义。我们不想受这种事阻碍,就从距未知那部分代码很远处开始工作。
We had a pretty good idea of where our world ended and the new world began. As we worked, we sometimes bumped up against this boundary. Though mists and clouds of ignorance obscured our view beyond the boundary, our work made us aware of what we wanted the boundary interface to be. We wanted to tell the transmitter something like this:
> 对于我们的世界如何结束、新世界如何开始,我们有许多好主意。工作时,我们偶尔会跨越那道边界。尽管云雾遮挡了我们看向边界那边的视线,我们还是从工作中了解到我们想要的边界接口是什么样的。我们想要告知发送机一些事:
Key the transmitter on the provided frequency and emit an analog representation of the data coming from this stream.
> 对于我们的世界如何结束、新世界如何开始,我们有许多好主意。工作时,我们偶尔会跨越那道边界。
We had no idea how that would be done because the API had not been designed yet. So we decided to work out the details later.
> 尽管云雾遮挡了我们看向边界那边的视线,我们还是从工作中了解到我们想要的边界接口是什么样的。我们想要告知发送机一些事:
To keep from being blocked, we defined our own interface. We called it something catchy, like Transmitter. We gave it a method called transmit that took a frequency and a data stream. This was the interface we wished we had.
> 为了不受阻碍,我们定义了自己使用的接口。我们给它取了个好记的名字,比如 Transmitter。我们给它写了个名为 transmit 的方法,获取频率参数和数据流。这就是我们希望得到的接口。
One good thing about writing the interface we wish we had is that it’s under our control. This helps keep client code more readable and focused on what it is trying to accomplish.
> 编写我们想得到的接口,好处之一是它在我们控制之下。这有助于保持客户代码更可读,且集中于它该完成的工作。
In Figure 8-2, you can see that we insulated the CommunicationsController classes from the transmitter API (which was out of our control and undefined). By using our own application specific interface, we kept our CommunicationsController code clean and expressive. Once the transmitter API was defined, we wrote the TransmitterAdapter to bridge the gap. The ADAPTER2 encapsulated the interaction with the API and provides a single place to change when the API evolves.
> 在图 8-2 中可以看到,我们将 CommunicationsController 类从发送器 API(该 API 不受我们控制,而且还没定义)中隔离出来。通过使用符合应用程序的接口,CommunicationsController 代码整洁且足以表达其意图。一旦发送器 API 被定义出来,我们就编写 TransmitterAdapter 来跨接。ADAPTER[1]封装了与 API 的互动,也提供了一个当 API 发生变动时唯一需要改动的地方。
2. See the Adapter pattern in [GOF].
Figure 8-2 Predicting the transmitter

This design also gives us a very convenient seam3 in the code for testing. Using a suitable FakeTransmitter, we can test the CommunicationsController classes. We can also create boundary tests once we have the TransmitterAPI that make sure we are using the API correctly.
> 这套设计方案为测试提供了一种极为方便的接缝[2]。使用适当的 FakeTransmitter,我们就能测试 CommunicationsController 类。在拿到 TransmitterAPI 时,我们也能创建确保正确使用 API 的边界测试。
3. See more about seams in [WELC].
## 8.6 CLEAN BOUNDARIES 整洁的边界
Interesting things happen at boundaries. Change is one of those things. Good software designs accommodate change without huge investments and rework. When we use code that is out of our control, special care must be taken to protect our investment and make sure future change is not too costly.
> 边界上会发生有趣的事。改动是其中之一。有良好的软件设计,无需巨大投入和重写即可进行修改。在使用我们控制不了的代码时,必须加倍小心保护投资,确保未来的修改不至于代价太大。
Code at the boundaries needs clear separation and tests that define expectations. We should avoid letting too much of our code know about the third-party particulars. It’s better to depend on something you control than on something you don’t control, lest it end up controlling you.
> 边界上的代码需要清晰的分割和定义了期望的测试。应该避免我们的代码过多地了解第三方代码中的特定信息。依靠你能控制的东西,好过依靠你控制不了的东西,免得日后受它控制。
We manage third-party boundaries by having very few places in the code that refer to them. We may wrap them as we did with Map, or we may use an ADAPTER to convert from our perfect interface to the provided interface. Either way our code speaks to us better, promotes internally consistent usage across the boundary, and has fewer maintenance points when the third-party code changes.
> 我们通过代码中少数几处引用第三方边界接口的位置来管理第三方边界。可以像我们对待 Map 那样包装它们,也可以使用 ADAPTER 模式将我们的接口转换为第三方提供的接口。采用这两种方式,代码都能更好地与我们沟通,在边界两边推动内部一致的用法,当第三方代码有改动时修改点也会更少。
================================================
FILE: docs/ch9.md
================================================
# 第 9 章 Unit Tests

Our profession has come a long way in the last ten years. In 1997 no one had heard of Test Driven Development. For the vast majority of us, unit tests were short bits of throw-away code that we wrote to make sure our programs “worked.” We would painstakingly write our classes and methods, and then we would concoct some ad hoc code to test them. Typically this would involve some kind of simple driver program that would allow us to manually interact with the program we had written.
I remember writing a C++ program for an embedded real-time system back in the mid-90s. The program was a simple timer with the following signature:
```cpp
void Timer::ScheduleCommand(Command* theCommand, int milliseconds)
```
The idea was simple; the execute method of the Command would be executed in a new thread after the specified number of milliseconds. The problem was, how to test it.
I cobbled together a simple driver program that listened to the keyboard. Every time a character was typed, it would schedule a command that would type the same character five seconds later. Then I tapped out a rhythmic melody on the keyboard and waited for that melody to replay on the screen five seconds later.
“I … want-a-girl … just … like-the-girl-who-marr … ied … dear … old … dad.”
I actually sang that melody while typing the “.” key, and then I sang it again as the dots appeared on the screen.
That was my test! Once I saw it work and demonstrated it to my colleagues, I threw the test code away.
As I said, our profession has come a long way. Nowadays I would write a test that made sure that every nook and cranny of that code worked as I expected it to. I would isolate my code from the operating system rather than just calling the standard timing functions. I would mock out those timing functions so that I had absolute control over the time. I would schedule commands that set boolean flags, and then I would step the time forward, watching those flags and ensuring that they went from false to true just as I changed the time to the right value.
Once I got a suite of tests to pass, I would make sure that those tests were convenient to run for anyone else who needed to work with the code. I would ensure that the tests and the code were checked in together into the same source package.
Yes, we’ve come a long way; but we have farther to go. The Agile and TDD movements have encouraged many programmers to write automated unit tests, and more are joining their ranks every day. But in the mad rush to add testing to our discipline, many programmers have missed some of the more subtle, and important, points of writing good tests.
THE THREE LAWS OF TDD
By now everyone knows that TDD asks us to write unit tests first, before we write production code. But that rule is just the tip of the iceberg. Consider the following three laws:1
1. Professionalism and Test-Driven Development, Robert C. Martin, Object Mentor, IEEE Software, May/June 2007 (Vol. 24, No. 3) pp. 32–36
http://doi.ieeecomputersociety.org/10.1109/MS.2007.85
First Law You may not write production code until you have written a failing unit test.
Second Law You may not write more of a unit test than is sufficient to fail, and not compiling is failing.
Third Law You may not write more production code than is sufficient to pass the currently failing test.
These three laws lock you into a cycle that is perhaps thirty seconds long. The tests and the production code are written together, with the tests just a few seconds ahead of the production code.
If we work this way, we will write dozens of tests every day, hundreds of tests every month, and thousands of tests every year. If we work this way, those tests will cover virtually all of our production code. The sheer bulk of those tests, which can rival the size of the production code itself, can present a daunting management problem.
KEEPING TESTS CLEAN
Some years back I was asked to coach a team who had explicitly decided that their test code should not be maintained to the same standards of quality as their production code. They gave each other license to break the rules in their unit tests. “Quick and dirty” was the watchword. Their variables did not have to be well named, their test functions did not need to be short and descriptive. Their test code did not need to be well designed and thoughtfully partitioned. So long as the test code worked, and so long as it covered the production code, it was good enough.
Some of you reading this might sympathize with that decision. Perhaps, long in the past, you wrote tests of the kind that I wrote for that Timer class. It’s a huge step from writing that kind of throw-away test, to writing a suite of automated unit tests. So, like the team I was coaching, you might decide that having dirty tests is better than having no tests.
What this team did not realize was that having dirty tests is equivalent to, if not worse than, having no tests. The problem is that tests must change as the production code evolves. The dirtier the tests, the harder they are to change. The more tangled the test code, the more likely it is that you will spend more time cramming new tests into the suite than it takes to write the new production code. As you modify the production code, old tests start to fail, and the mess in the test code makes it hard to get those tests to pass again. So the tests become viewed as an ever-increasing liability.
From release to release the cost of maintaining my team’s test suite rose. Eventually it became the single biggest complaint among the developers. When managers asked why their estimates were getting so large, the developers blamed the tests. In the end they were forced to discard the test suite entirely.
But, without a test suite they lost the ability to make sure that changes to their code base worked as expected. Without a test suite they could not ensure that changes to one part of their system did not break other parts of their system. So their defect rate began to rise. As the number of unintended defects rose, they started to fear making changes. They stopped cleaning their production code because they feared the changes would do more harm than good. Their production code began to rot. In the end they were left with no tests, tangled and bug-riddled production code, frustrated customers, and the feeling that their testing effort had failed them.
In a way they were right. Their testing effort had failed them. But it was their decision to allow the tests to be messy that was the seed of that failure. Had they kept their tests clean, their testing effort would not have failed. I can say this with some certainty because I have participated in, and coached, many teams who have been successful with clean unit tests.
The moral of the story is simple: Test code is just as important as production code. It is not a second-class citizen. It requires thought, design, and care. It must be kept as clean as production code.
Tests Enable the -ilities
If you don’t keep your tests clean, you will lose them. And without them, you lose the very thing that keeps your production code flexible. Yes, you read that correctly. It is unit tests that keep our code flexible, maintainable, and reusable. The reason is simple. If you have tests, you do not fear making changes to the code! Without tests every change is a possible bug. No matter how flexible your architecture is, no matter how nicely partitioned your design, without tests you will be reluctant to make changes because of the fear that you will introduce undetected bugs.
But with tests that fear virtually disappears. The higher your test coverage, the less your fear. You can make changes with near impunity to code that has a less than stellar architecture and a tangled and opaque design. Indeed, you can improve that architecture and design without fear!
So having an automated suite of unit tests that cover the production code is the key to keeping your design and architecture as clean as possible. Tests enable all the -ilities, because tests enable change.
So if your tests are dirty, then your ability to change your code is hampered, and you begin to lose the ability to improve the structure of that code. The dirtier your tests, the dirtier your code becomes. Eventually you lose the tests, and your code rots.
CLEAN TESTS
What makes a clean test? Three things. Readability, readability, and readability. Readability is perhaps even more important in unit tests than it is in production code. What makes tests readable? The same thing that makes all code readable: clarity, simplicity, and density of expression. In a test you want to say a lot with as few expressions as possible.
Consider the code from FitNesse in Listing 9-1. These three tests are difficult to understand and can certainly be improved. First, there is a terrible amount of duplicate code [G5] in the repeated calls to addPage and assertSubString. More importantly, this code is just loaded with details that interfere with the expressiveness of the test.
Listing 9-1 SerializedPageResponderTest.java
```java
public void testGetPageHieratchyAsXml() throws Exception
{
crawler.addPage(root, PathParser.parse(“PageOne”));
crawler.addPage(root, PathParser.parse(“PageOne.ChildOne”));
crawler.addPage(root, PathParser.parse(“PageTwo”));
request.setResource(“root”);
request.addInput(“type”, “pages”);
Responder responder = new SerializedPageResponder();
SimpleResponse response =
(SimpleResponse) responder.makeResponse(
new FitNesseContext(root), request);
String xml = response.getContent();
assertEquals(“text/xml”, response.getContentType());
assertSubString(“PageOne ”, xml);
assertSubString(“PageTwo ”, xml);
assertSubString(“ChildOne ”, xml);
}
public void testGetPageHieratchyAsXmlDoesntContainSymbolicLinks()
throws Exception {
WikiPage pageOne = crawler.addPage(root, PathParser.parse(“PageOne”));
crawler.addPage(root, PathParser.parse(“PageOne.ChildOne”));
crawler.addPage(root, PathParser.parse(“PageTwo”));
PageData data = pageOne.getData();
WikiPageProperties properties = data.getProperties();
WikiPageProperty symLinks = properties.set(SymbolicPage.PROPERTY_NAME);
symLinks.set(“SymPage”, ”PageTwo”);
pageOne.commit(data);
request.setResource(“root”);
request.addInput(“type”, ”pages”);
Responder responder = new SerializedPageResponder();
SimpleResponse response =
(SimpleResponse) responder.makeResponse(
new FitNesseContext(root), request);
String xml = response.getContent();
assertEquals(“text/xml”, response.getContentType());
assertSubString(“PageOne ”, xml);
assertSubString(“PageTwo ”, xml);
assertSubString(“ChildOne ”, xml);
assertNotSubString(“SymPage”, xml);
}
public void testGetDataAsHtml() throws Exception
{
crawler.addPage(root, PathParser.parse(“TestPageOne”), ”test page”);
request.setResource(“TestPageOne”);
request.addInput(“type”, ”data”);
Responder responder = new SerializedPageResponder();
SimpleResponse response =
(SimpleResponse) responder.makeResponse(
new FitNesseContext(root), request);
String xml = response.getContent();
assertEquals(“text/xml”, response.getContentType());
assertSubString(“test page”, xml);
assertSubString(“PageOne”, “PageTwo ”, “ChildOne ”
);
}
public void testSymbolicLinksAreNotInXmlPageHierarchy() throws Exception {
WikiPage page = makePage(“PageOne”);
makePages(“PageOne.ChildOne”, “PageTwo”);
addLinkTo(page, “PageTwo”, “SymPage”);
submitRequest(“root”, “type:pages”);
assertResponseIsXML();
assertResponseContains(
“PageOne ”, “PageTwo ”,
“ChildOne ”
);
assertResponseDoesNotContain(“SymPage”);
}
public void testGetDataAsXml() throws Exception {
makePageWithContent(“TestPageOne”, “test page”);
submitRequest(“TestPageOne”, “type:data”);
assertResponseIsXML();
assertResponseContains(“test page”, “PageOne”, “PageTwo ”, “ChildOne ”
);
}
```
Notice that I have changed the names of the functions to use the common given-when-then5 convention. This makes the tests even easier to read. Unfortunately, splitting the tests as shown results in a lot of duplicate code.
5. [RSpec].
We can eliminate the duplication by using the TEMPLATE METHOD6 pattern and putting the given/when parts in the base class, and the then parts in different derivatives. Or we could create a completely separate test class and put the given and when parts in the @Before function, and the when parts in each @Test function. But this seems like too much mechanism for such a minor issue. In the end, I prefer the multiple asserts in Listing 9-2.
6. [GOF].
I think the single assert rule is a good guideline.7 I usually try to create a domain-specific testing language that supports it, as in Listing 9-5. But I am not afraid to put more than one assert in a test. I think the best thing we can say is that the number of asserts in a test ought to be minimized.
7. “Keep to the code!”
Single Concept per Test
Perhaps a better rule is that we want to test a single concept in each test function. We don’t want long test functions that go testing one miscellaneous thing after another. Listing 9-8 is an example of such a test. This test should be split up into three independent tests because it tests three independent things. Merging them all together into the same function forces the reader to figure out why each section is there and what is being tested by that section.
Listing 9-8
```java
/**
* Miscellaneous tests for the addMonths() method.
*/
public void testAddMonths() {
SerialDate d1 = SerialDate.createInstance(31, 5, 2004);
SerialDate d2 = SerialDate.addMonths(1, d1);
assertEquals(30, d2.getDayOfMonth());
assertEquals(6, d2.getMonth());
assertEquals(2004, d2.getYYYY());
SerialDate d3 = SerialDate.addMonths(2, d1);
assertEquals(31, d3.getDayOfMonth());
assertEquals(7, d3.getMonth());
assertEquals(2004, d3.getYYYY());
SerialDate d4 = SerialDate.addMonths(1, SerialDate.addMonths(1, d1));
assertEquals(30, d4.getDayOfMonth());
assertEquals(7, d4.getMonth());
assertEquals(2004, d4.getYYYY());
}
```
The three test functions probably ought to be like this:
- Given the last day of a month with 31 days (like May):
1. When you add one month, such that the last day of that month is the 30th (like June), then the date should be the 30th of that month, not the 31st.
2. When you add two months to that date, such that the final month has 31 days, then the date should be the 31st.
- Given the last day of a month with 30 days in it (like June):
1. When you add one month such that the last day of that month has 31 days, then the date should be the 30th, not the 31st.
Stated like this, you can see that there is a general rule hiding amidst the miscellaneous tests. When you increment the month, the date can be no greater than the last day of the month. This implies that incrementing the month on February 28th should yield March 28th. That test is missing and would be a useful test to write.
So it’s not the multiple asserts in each section of Listing 9-8 that causes the problem. Rather it is the fact that there is more than one concept being tested. So probably the best rule is that you should minimize the number of asserts per concept and test just one concept per test function.
F.I.R.S.T.8
8. Object Mentor Training Materials.
Clean tests follow five other rules that form the above acronym:
Fast Tests should be fast. They should run quickly. When tests run slow, you won’t want to run them frequently. If you don’t run them frequently, you won’t find problems early enough to fix them easily. You won’t feel as free to clean up the code. Eventually the code will begin to rot.
Independent Tests should not depend on each other. One test should not set up the conditions for the next test. You should be able to run each test independently and run the tests in any order you like. When tests depend on each other, then the first one to fail causes a cascade of downstream failures, making diagnosis difficult and hiding downstream defects.
Repeatable Tests should be repeatable in any environment. You should be able to run the tests in the production environment, in the QA environment, and on your laptop while riding home on the train without a network. If your tests aren’t repeatable in any environment, then you’ll always have an excuse for why they fail. You’ll also find yourself unable to run the tests when the environment isn’t available.
Self-Validating The tests should have a boolean output. Either they pass or fail. You should not have to read through a log file to tell whether the tests pass. You should not have to manually compare two different text files to see whether the tests pass. If the tests aren’t self-validating, then failure can become subjective and running the tests can require a long manual evaluation.
Timely The tests need to be written in a timely fashion. Unit tests should be written just before the production code that makes them pass. If you write tests after the production code, then you may find the production code to be hard to test. You may decide that some production code is too hard to test. You may not design the production code to be testable.
CONCLUSION
We have barely scratched the surface of this topic. Indeed, I think an entire book could be written about clean tests. Tests are as important to the health of a project as the production code is. Perhaps they are even more important, because tests preserve and enhance the flexibility, maintainability, and reusability of the production code. So keep your tests constantly clean. Work to make them expressive and succinct. Invent testing APIs that act as domain-specific language that helps you write the tests.
If you let the tests rot, then your code will rot too. Keep your tests clean.
================================================
FILE: gitee-deploy.sh
================================================
#!/usr/bin/env sh
# abort on errors
set -e
# build
yarn docs:build
# navigate into the build output directory
cd docs/.vuepress/dist
# if you are deploying to a custom domain
echo 'http://gdut_yy.gitee.io/doc-cleancode/' > CNAME
git init
git add -A
git commit -m 'deploy'
# if you are deploying to https://.github.io
git push -f git@gitee.com:gdut_yy/doc-cleancode.git master
# if you are deploying to https://.github.io/
# git push -f git@github.com:/.git master:gh-pages
cd -
================================================
FILE: package.json
================================================
{
"scripts": {
"docs:dev": "vuepress dev docs",
"docs:build": "vuepress build docs"
}
}
| | |