Art « 邵珠庆の日记

27 March 2013

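Why code should be easy to understand

“Code should be written to minimize the time it would take for someone else to understand it.”

The day-to-day reality is:

Thinking before you write and reading existing code take far more time than the writing itself.
Reading code is routine, whether it is someone else's or your own; code you wrote six months ago might as well be someone else's.
Readable code lets you grasp the program's logic quickly and get into the work.
Fewer lines of code are not necessarily easier to understand.
Readability does not conflict with efficiency, good architecture, or testability.

The whole book is organized around one goal: making code more readable. That is also one of the key criteria of good code.

How to name things

Pack more information into names

Use words with a precise meaning, for example `download` rather than `get`. Some possible substitutions:

 send -> deliver, dispatch, announce, distribute, route
 find -> search, extract, locate, recover
start -> launch, create, begin, open
 make -> create, set up, build, generate, compose, add, new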

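Avoid generic words

Names like tmp and retval say nothing beyond "temporary variable" and "return value". Adding a meaningful word makes the purpose clear:

tmp_file = tempfile.NamedTemporaryFile()
...
SaveData(tmp_file, ...)

Instead of retval, use a name that says what the value really represents:

sum_squares += v[i]; // Where's the "square" that we're summing? Bug!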

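In nested for loops, i and j can be just as confusing:

for (int i = 0; i < clubs.size(); i++)
    for (int j = 0; j < clubs[i].members.size(); j++)
        for (int k = 0; k < users.size(); k++)
            if (clubs[i].members[k] == users[j])
                cout << "user[" << j << "] is in club[" << i << "]" << endl;

The comparison uses the wrong indices (members[k] and users[j] are swapped), and the names give no hint of that. With more precise index names the mistake stands out:

if (clubs[ci].members[mi] == users[ui]) // OK. First letters match.

So use a generic name only when you have a good reason to.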

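Use concrete names

CanListenOnPort is better than ServerCanStart: "can start" is vague, while "listen on port" states exactly what the method is going to do.

Likewise, --run_locally is less clear than --extra_logging.

Attach important details, such as a unit suffix like `_ms`, or `_raw` for an unprocessed string

If there is something important to know about a variable, a few extra words in its name make it easier to read. For example, change string id; // Example: "af84ef845cd8" to string hex_id;.

Start(int delay)               --> delay → delay_secs
CreateCache(int size)          --> size → size_mb
ThrottleDownload(float limit)  --> limit → max_kbps
Rotate(float angle)            --> angle → degrees_cw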

More examples:

password -> plaintext_password
 comment -> unescaped_comment
 html -> html_utf8
 data -> data_urlenc
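As a small illustration of unit suffixes, here is a sketch in Python. The function `throttle_delay_secs` and its parameters are hypothetical names made up for this example, and `max_kbps` is assumed to mean kilobits per second.

```python
def throttle_delay_secs(chunk_bytes, max_kbps):
    """Return how long sending chunk_bytes should take at max_kbps.

    The suffixes carry the units: bytes in, kilobits per second in,
    seconds out. Without them the caller would have to read the body.
    """
    chunk_bits = chunk_bytes * 8
    max_bits_per_sec = max_kbps * 1000.0
    return chunk_bits / max_bits_per_sec


# 125,000 bytes are 1,000,000 bits, which take one second at 1000 kbps.
delay_secs = throttle_delay_secs(chunk_bytes=125_000, max_kbps=1000)
```

With the units in the names, a call site such as `throttle_delay_secs(chunk_bytes=..., max_kbps=...)` can be checked for unit mistakes without opening the function.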

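Use longer names for larger scopes

In a small scope a short name is fine; a variable that is visible across a larger scope deserves a longer name, and editor auto-completion takes care of the extra typing. For abbreviations and prefixes, prefer widely known ones (such as str). A useful test: would a new team member understand the abbreviation without asking anyone?

Use `_`, `-`, and similar characters consistently, for example a `_` prefix for private variables.

var x = new DatePicker(); // DatePicker() is a "constructor" function, so it starts with a capital letter
var y = pageHeight(); // pageHeight() is an ordinary function
 
var $all_images = $("img"); // $all_images is a jQuery object
var height = 250; // height is not
 
// Write ids and classes in different styles
<div id="middle_column" class="main-content"> ...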

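Names must not be ambiguous

Before settling on a name, ask whether the word could be read another way. For example:

results = Database.all_objects.filter("year <= 2011")

Does results now contain the objects with year <= 2011, or the objects that are left after those have been removed? The word filter allows both readings.

Use `min` and `max` instead of `limit`

CART_TOO_BIG_LIMIT = 10
if shopping_cart.num_items() >= CART_TOO_BIG_LIMIT:
    Error("Too many items in cart.")

MAX_ITEMS_IN_CART = 10
if shopping_cart.num_items() > MAX_ITEMS_IN_CART:
    Error("Too many items in cart.")

Compare CART_TOO_BIG_LIMIT and MAX_ITEMS_IN_CART: which is better? "Limit" does not say whether the limit itself is still allowed, which is how the >= off-by-one in the first version slips in. "Max" makes it clear that a cart with exactly 10 items is fine.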
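The boundary behaviour can be pinned down in a few lines. This is a minimal sketch; `cart_is_too_big` is a hypothetical helper, not part of the original example.

```python
MAX_ITEMS_IN_CART = 10  # inclusive: a cart may hold up to and including 10 items


def cart_is_too_big(num_items):
    """Return True only when the cart exceeds the inclusive maximum."""
    return num_items > MAX_ITEMS_IN_CART


at_the_maximum = cart_is_too_big(10)    # False: 10 items are still allowed
over_the_maximum = cart_is_too_big(11)  # True
```

Because the constant is named as a maximum, the comparison can only sensibly be `>`, and the boundary case needs no comment.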

Use `first` and `last` for inclusive ranges

print integer_range(start=2, stop=4)
# Does this print [2,3] or [2,3,4] (or something else)?
 
set.PrintKeys(first="Bart", last="Maggie")

first and last are unambiguous and well suited to inclusive ranges.

Use `begin` and `end` for inclusive/exclusive (half-open) ranges such as [2, 9)

PrintEventsInRange("OCT 16 12:00am", "OCT 17 12:00am")
 
PrintEventsInRange("OCT 16 12:00am", "OCT 16 11:59:59.9999pm")

The first call, with an exclusive end, is much more comfortable to write and read than the second.
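A half-open range is also easy to implement correctly. The sketch below assumes events are simple (timestamp, name) tuples; `events_in_range` and the sample data are invented for illustration.

```python
from datetime import datetime


def events_in_range(events, begin, end):
    """Return the (timestamp, name) events with begin <= timestamp < end."""
    return [event for event in events if begin <= event[0] < end]


events = [
    (datetime(2023, 10, 16, 0, 0), "midnight sync"),
    (datetime(2023, 10, 16, 23, 59, 59), "late backup"),
    (datetime(2023, 10, 17, 0, 0), "next day's sync"),
]

# All of October 16: begin is included, end (midnight of the 17th) is not.
oct_16_events = events_in_range(
    events, begin=datetime(2023, 10, 16), end=datetime(2023, 10, 17)
)
```

Consecutive ranges written this way share their boundary value without counting any event twice, which is why no "11:59:59.9999pm" is needed.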

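Naming Boolean variables

bool read_password = true;

This is a dangerous name: does it mean the password still needs to be read, or that it has already been read? There is no way to tell, so a name such as user_is_authenticated is better. In general, adding is, has, can, or should to a Boolean name makes its meaning clearer, and positive names read better than negated ones:

SpaceLeft() --> hasSpaceLeft()
bool disable_ssl = false --> bool use_ssl = true

Match the reader's expectations

public class StatisticsCollector {
    public void addSample(double x) { ... }
    public double getMean() {
        // Iterate through all samples and return total / num_samples
    }
    ...
}

Here getMean iterates over every sample to compute total / num_samples, so it is not the lightweight accessor that "get" usually suggests. computeMean is a more fitting name.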

Clean formatting

Well-formatted code is pleasant to look at and much easier to read. Compare the following two examples:

class StatsKeeper {
 public:
 // A class for keeping track of a series of doubles
 void Add(double d); // and methods for quick statistics about them
 private: int count; /* how many so far
 */ public:
 double Average();
 private: double minimum;
 list<double>
 past_items
 ;double maximum;
};

And here is a version that is pleasing to read:

// A class for keeping track of a series of doubles
// and methods for quick statistics about them.
class StatsKeeper {
  public:
    void Add(double d);
    double Average();
  private:
    list<double> past_items;
    int count; // how many so far
    double minimum;
    double maximum;
};

Keep line breaks consistent and compact

This code needs line breaks to stay within 80 characters per line, and the arguments need explanatory comments:

public class PerformanceTester {
    public static final TcpConnectionSimulator wifi = new TcpConnectionSimulator(
        500, /* Kbps */
        80, /* millisecs latency */
        200, /* jitter */
        1 /* packet loss % */);

    public static final TcpConnectionSimulator t3_fiber = new TcpConnectionSimulator(
        45000, /* Kbps */
        10, /* millisecs latency */
        0, /* jitter */
        0 /* packet loss % */);

    public static final TcpConnectionSimulator cell = new TcpConnectionSimulator(
        100, /* Kbps */
        400, /* millisecs latency */
        250, /* jitter */
        5 /* packet loss % */);
}

For the sake of consistency, first rearrange it like this:

public class PerformanceTester {
    public static final TcpConnectionSimulator wifi =
        new TcpConnectionSimulator(
            500,   /* Kbps */
            80,    /* millisecs latency */
            200,   /* jitter */
            1      /* packet loss % */);

    public static final TcpConnectionSimulator t3_fiber =
        new TcpConnectionSimulator(
            45000, /* Kbps */
            10,    /* millisecs latency */
            0,     /* jitter */
            0      /* packet loss % */);

    public static final TcpConnectionSimulator cell =
        new TcpConnectionSimulator(
            100,   /* Kbps */
            400,   /* millisecs latency */
            250,   /* jitter */
            5      /* packet loss % */);
}

That is more consistent, but it is still verbose and takes up a lot of vertical space. Moving the repeated comments to the top gives a more compact version:

public class PerformanceTester {
    //      TcpConnectionSimulator(throughput, latency, jitter, packet_loss)
    //                             [Kbps]      [ms]     [ms]    [percent]
    public static final TcpConnectionSimulator wifi =
        new TcpConnectionSimulator(500,        80,      200,    1);

    public static final TcpConnectionSimulator t3_fiber =
        new TcpConnectionSimulator(45000,      10,      0,      0);

    public static final TcpConnectionSimulator cell =
        new TcpConnectionSimulator(100,        400,     250,    5);
}

Wrap repetition in a helper function

// Turn a partial_name like "Doug Adams" into "Mr. Douglas Adams".
// If not possible, 'error' is filled with an explanation.
string ExpandFullName(DatabaseConnection dc, string partial_name, string* error);
 
DatabaseConnection database_connection;
string error;
assert(ExpandFullName(database_connection, "Doug Adams", &error)
 == "Mr. Douglas Adams");
assert(error == "");
assert(ExpandFullName(database_connection, " Jake Brown ", &error)
 == "Mr. Jacob Brown III");
assert(error == "");
assert(ExpandFullName(database_connection, "No Such Guy", &error) == "");
assert(error == "no match found");
assert(ExpandFullName(database_connection, "John", &error) == "");
assert(error == "more than one result");

The code above is cluttered and repetitive. A helper function tidies it up:

CheckFullName("Doug Adams", "Mr. Douglas Adams", "");
CheckFullName(" Jake Brown ", "Mr. Jacob Brown III", "");
CheckFullName("No Such Guy", "", "no match found");
CheckFullName("John", "", "more than one result");
 
void CheckFullName(string partial_name,
                   string expected_full_name,
                   string expected_error) {
    // database_connection is now a class member
    string error;
    string full_name = ExpandFullName(database_connection, partial_name, &error);
    assert(error == expected_error);
    assert(full_name == expected_full_name);
}

Column alignment

Aligning columns makes a block of code easier to scan:

CheckFullName("Doug Adams"  , "Mr. Douglas Adams"  , "");
CheckFullName(" Jake Brown ", "Mr. Jacob Brown III", "");
CheckFullName("No Such Guy" , ""                   , "no match found");
CheckFullName("John"        , ""                   , "more than one result");
 
commands[] = {
    ...
    { "timeout"      , NULL              , cmd_spec_timeout},
    { "timestamping" , &opt.timestamping , cmd_boolean},
    { "tries"        , &opt.ntry         , cmd_number_inf},
    { "useproxy"     , &opt.use_proxy    , cmd_boolean},
    { "useragent"    , NULL              , cmd_spec_useragent},
    ...
};

Group code into blocks

class FrontendServer {
 public:
 FrontendServer();
 void ViewProfile(HttpRequest* request);
 void OpenDatabase(string location, string user);
 void SaveProfile(HttpRequest* request);
 string ExtractQueryParam(HttpRequest* request, string param);
 void ReplyOK(HttpRequest* request, string html);
 void FindFriends(HttpRequest* request);
 void ReplyNotFound(HttpRequest* request, string error);
 void CloseDatabase(string location);
 ~FrontendServer();
};

The version above is readable, but grouping related declarations improves it:

class FrontendServer {
 public:
 FrontendServer();
 ~FrontendServer();
 // Handlers
 void ViewProfile(HttpRequest* request);
 void SaveProfile(HttpRequest* request);
 void FindFriends(HttpRequest* request);
 
 // Request/Reply Utilities
 string ExtractQueryParam(HttpRequest* request, string param);
 void ReplyOK(HttpRequest* request, string html);
 void ReplyNotFound(HttpRequest* request, string error);
 
 // Database Helpers
 void OpenDatabase(string location, string user);
 void CloseDatabase(string location);
};

Here is another piece of code:

# Import the user's email contacts, and match them to users in our system.
# Then display a list of those users that he/she isn't already friends with.
def suggest_new_friends(user, email_password):
    friends = user.friends()
    friend_emails = set(f.email for f in friends)
    contacts = import_contacts(user.email, email_password)
    contact_emails = set(c.email for c in contacts)
    non_friend_emails = contact_emails - friend_emails
    suggested_friends = User.objects.select(email__in=non_friend_emails)
    display['user'] = user
    display['friends'] = friends
    display['suggested_friends'] = suggested_friends
    return render("suggested_friends.html", display)

Everything runs together, which is visually heavy. Split it into blocks by purpose:

def suggest_new_friends(user, email_password):
    # Get the user's friends' email addresses.
    friends = user.friends()
    friend_emails = set(f.email for f in friends)

    # Import all email addresses from this user's email account.
    contacts = import_contacts(user.email, email_password)
    contact_emails = set(c.email for c in contacts)

    # Find matching users that they aren't already friends with.
    non_friend_emails = contact_emails - friend_emails
    suggested_friends = User.objects.select(email__in=non_friend_emails)

    # Display these lists on the page.
    display['user'] = user
    display['friends'] = friends
    display['suggested_friends'] = suggested_friends

    return render("suggested_friends.html", display)

Making code pleasant to read takes attention while you write and a few good habits. This matters most in a team: a style choice such as brace placement is neither right nor wrong in itself, but not following the team's convention is wrong.

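How to write comments

While writing code you have a lot in your head, but all the reader finally sees is the code itself; the rest of that information is lost. The purpose of a comment is to give some of it back to the reader.

What to comment

What not to comment

Comments like these are worthless: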

// The class definition for Account
class Account {
 public:
 // Constructor
 Account();
 // Set the profit member to a new value
 void SetProfit(double profit);
 // Return the profit from this Account
 double GetProfit();
};

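Do not comment just for the sake of commenting, for instance by restating the function signature in words. If a declaration is worth a comment, the comment should add details the declaration does not show, as this one does: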

// Find a Node with the given 'name' or return NULL.
// If depth <= 0, only 'subtree' is inspected.
// If depth == N, only 'subtree' and N levels below are inspected.
Node* FindNodeInSubtree(Node* subtree, string name, int depth);

Do not comment a bad name; fix the name

// Enforce limits on the Reply as stated in the Request,
// such as the number of items returned, or total byte size, etc. 
void CleanReply(Request request, Reply reply);

Most of the comment is spent explaining what "clean" means, so it is better to pick an accurate name instead:

// Make sure 'reply' meets the count/byte/etc. limits from the 'request'
void EnforceLimitsFromRequest(Request request, Reply reply);

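Record your thoughts

So much for what not to comment. What should be commented? Comments should record the insights you had while working out how to write the code, such as these: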

// Surprisingly, a binary tree was 40% faster than a hash table for this data.
// The cost of computing a hash was more than the left/right comparisons.
 
// This heuristic might miss a few words. That's OK; solving this 100% is hard.
 
// This class is getting messy. Maybe we should create a 'ResourceNode' subclass to
// help organize things.

Comments can also record known shortcomings (TODOs) and the reasoning behind constants:

// TODO: use a faster algorithm
// TODO(dustin): handle other image formats besides JPEG
 
NUM_THREADS = 8 # as long as it's >= 2 * num_processors, that's good enough.
 
// Impose a reasonable limit - no human can read that much anyway.
const int MAX_RSS_SUBSCRIPTIONS = 1000;

Commonly used markers:

TODO  : Stuff I haven't gotten around to yet
FIXME : Known-broken code here
HACK  : Admittedly inelegant solution to a problem
XXX   : Danger! Major problem here

Put yourself in the reader's shoes

Whatever would make someone reading your code stop and wonder is what you should comment.

struct Recorder {
 vector<float> data;
 ...
 void Clear() {
 vector<float>().swap(data); // Huh? Why not just data.clear()? 
 }
};

Many C++ programmers reading this will wonder why the code does not simply call data.clear() instead of swapping with an empty vector, so that line deserves a comment:

// Force vector to relinquish its memory (look up "STL swap trick")
vector<float>().swap(data);

Flag likely pitfalls

If your code relies on a hack, or has some other pitfall that readers need to know about, say so in a comment. Consider:

void SendEmail(string to, string subject, string body);

In reality this function calls an external service and can time out, so it needs a comment:

// Calls an external service to deliver email. (Times out after 1 minute.)
void SendEmail(string to, string subject, string body);

Big-picture comments

Sometimes a comment on the whole file is needed to give the reader an overview:

// This file contains helper functions that provide a more convenient interface to our
// file system. It handles file permissions and other nitty-gritty details.

Summary comments

Even inside a function, a comment can summarize a block in the same way a file comment summarizes a file:

# Find all the items that customers purchased for themselves.
for customer_id in all_customers:
    for sale in all_sales[customer_id].sales:
        if sale.recipient == customer_id:
            ...

Or write comments that follow the steps of the function:

def GenerateUserReport():
    # Acquire a lock for this user
    ...
    # Read user's info from the database
    ...
    # Write info to a file
    ...
    # Release the lock for this user

Many people are reluctant to write comments, and writing good ones is indeed not easy. One option is to set aside a dedicated place in the file where you can write down whatever you want to say.

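Make comments precise and compact

The previous section covered what to comment; this one covers how. Comments matter, so they must be precise; they also take up screen space, so they must be compact.

Keep comments compact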

// The int is the CategoryType.
// The first float in the inner pair is the 'score',
// the second is the 'weight'.
typedef hash_map<int, pair<float, float> > ScoreMap;

That is too wordy. It can be compressed to:

// CategoryType -> (score, weight)
typedef hash_map<int, pair<float, float> > ScoreMap;

Avoid ambiguous pronouns

// Insert the data into the cache, but check if it's too big first.

Here "it" is ambiguous: it could refer to the data or to the cache. Rewrite it as:

// Insert the data into the cache, but check if the data is too big first.

There is an even better fix, in which "it" has only one possible referent:

// If the data is small enough, insert it into the cache.

Keep sentences tight and exact

# Depending on whether we've already crawled this URL before, give it a different priority.

That sentence takes effort to understand. This version is much easier:

# Give higher priority to URLs we've never crawled before.

Describe function behavior precisely

// Return the number of lines in this file.
int CountLines(string filename) { ... }

A function documented like this can leave callers puzzled, because "line" can be read in several ways:

"" (an empty file): 0 lines or 1?
"hello": 0 lines or 1?
"hello\n": 1 line or 2?
"hello\n world": 1 line or 2?
"hello\n\r cruel\n world\r": 2, 3, or 4 lines?

So the comment should say exactly what is counted:

// Count how many newline bytes ('\n') are in the file.
int CountLines(string filename) { ... }
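With the precise definition, each of the ambiguous cases listed above has exactly one answer. The following is a minimal sketch in Python; `count_newlines` is a hypothetical function that works on bytes already read from a file rather than on a filename.

```python
def count_newlines(data):
    """Count how many newline bytes (b'\\n') are in data."""
    return data.count(b"\n")


# The ambiguous cases from the list above, now with a single answer each.
answers = {
    case: count_newlines(case)
    for case in [
        b"",
        b"hello",
        b"hello\n",
        b"hello\n world",
        b"hello\n\r cruel\n world\r",
    ]
}
# answers: 0, 0, 1, 1 and 2 respectively
```

Carriage returns are deliberately not counted, which is exactly what the comment promises.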

Use examples to illustrate corner cases

// Rearrange 'v' so that elements < pivot come before those >= pivot;
// Then return the largest 'i' for which v[i] < pivot (or -1 if none are < pivot)
int Partition(vector<int>* v, int pivot);

The description is precise, but an example makes it even better:

// ...
// Example: Partition([8 5 9 8 2], 8) might result in [5 2 | 8 9 8] and return 1
int Partition(vector<int>* v, int pivot);
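To show that the example in the comment is achievable, here is one possible implementation sketched in Python. It is only an illustration of the documented contract, not the book's code; it uses a simple stable partition, which happens to reproduce the example exactly, whereas the comment's "might result in" allows other orderings too.

```python
def partition(v, pivot):
    """Rearrange v in place so elements < pivot come before those >= pivot.

    Returns the largest index i for which v[i] < pivot, or -1 if no
    element is smaller than pivot.
    """
    smaller = [x for x in v if x < pivot]
    not_smaller = [x for x in v if x >= pivot]
    v[:] = smaller + not_smaller  # modify the caller's list in place
    return len(smaller) - 1


values = [8, 5, 9, 8, 2]
last_smaller_index = partition(values, 8)
# values is now [5, 2, 8, 9, 8] and last_smaller_index is 1
```

The corner case from the comment is covered as well: when nothing is smaller than the pivot, `len(smaller) - 1` evaluates to -1.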

State the intent of your code

void DisplayProducts(list<Product> products) {
 products.sort(CompareProductByPrice);
 // Iterate through the list in reverse order
 for (list<Product>::reverse_iterator it = products.rbegin(); it != products.rend();
 ++it)
 DisplayPrice(it->price);
 ... 
}

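The comment says the list is traversed in reverse, but that only restates what the code does. It should state the intent instead:

// Display each price, from highest to lowest
for (list<Product>::reverse_iterator it = products.rbegin(); ... )

Comments at function call sites

A call like this is bound to leave the reader puzzled:

Connect(10, false);

In a language with named parameters, such as Python, the call can explain itself: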

def Connect(timeout, use_encryption): ...
 
# Call the function using named parameters
Connect(timeout = 10, use_encryption = False)
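Python can go one step further and require the names. The sketch below uses keyword-only parameters (the bare `*` in the signature); `connect` is a hypothetical stand-in that merely returns its settings so the effect can be observed.

```python
def connect(*, timeout_secs, use_encryption):
    """Hypothetical stand-in for Connect; returns its settings for inspection."""
    return {"timeout_secs": timeout_secs, "use_encryption": use_encryption}


settings = connect(timeout_secs=10, use_encryption=False)

# A positional call such as connect(10, False) is rejected by the interpreter.
try:
    connect(10, False)
    positional_call_allowed = True
except TypeError:
    positional_call_allowed = False
```

The design choice here is to make the unreadable call impossible rather than merely discouraged.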

Use information-dense words

// This class contains a number of members that store the same information as in the
// database, but are stored here for speed. When this class is read from later, those
// members are checked first to see if they exist, and if so are returned; otherwise the
// database is read from and that data stored in those fields for next time.

The long comment above is clear enough, but a single well-known term says the same thing and leaves no doubt:

// This class acts as a caching layer to the database.

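Simplifying loops and logic

Keep control flow simple

The goal of this section is to make conditionals, loops, and other control flow as natural as possible, so that the reader never has to stop and think or go back and reread.

Order of arguments in conditionals

Compare these two ways of writing the same conditions: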

if (length >= 10)
while (bytes_received < bytes_expected)
 
if (10 <= length)
while (bytes_expected > bytes_received)

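Should the operands simply follow the direction of the comparison, or is there another rule? There is: order them by their role.

Left-hand side: usually the expression being examined, whose value changes more.
Right-hand side: usually the value it is compared against, which is closer to a constant.

That explains why bytes_received < bytes_expected is easier to understand than the reverse.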

Order of if/else blocks

You are usually free to choose the order of the if and else blocks. Both of the following are fine:

if (a == b) {
 // Case One ...
} else {
 // Case Two ...
}
 
if (a != b) {
 // Case Two ...
} else {
 // Case One ...
}

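You may never have given this much thought, but sometimes one order really is better than the other:

Put the positive case first: if (debug) rather than if (!debug).
Put the simpler case first, so that the if and the else fit on one screen.
Put the more interesting or conspicuous case first.

An example: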

if (!url.HasQueryParameter("expand_all")) {
 response.Render(items);
 ...
} else {
 for (int i = 0; i < items.size(); i++) {
 items[i].Expand();
 }
 ... 
}

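On seeing the if, the first thing you think about is expand_all. It is like being told "don't think of an elephant": you cannot help thinking about it, which causes a moment of confusion. Since expand_all is the interesting case, it is better to handle it first: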

if (url.HasQueryParameter("expand_all")) {
 for (int i = 0; i < items.size(); i++) {
 items[i].Expand();
 }
 ... 
} else {
 response.Render(items);
 ... 
}

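The ternary operator (?:)

time_str += (hour >= 12) ? "pm" : "am";

Without the ternary operator, you might write:

if (hour >= 12) {
    time_str += "pm";
} else {
    time_str += "am";
}

The ternary operator reduces the line count, and the example above is a good case for it. But the real goal is to minimize the time needed to understand the code, so it is a poor fit for the first statement below, which reads better as the if/else that follows it: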

return exponent >= 0 ? mantissa * (1 << exponent) : mantissa / (1 << -exponent);
 
if (exponent >= 0) {
 return mantissa * (1 << exponent);
} else {
 return mantissa / (1 << -exponent);
}

So use the ternary operator only for simple expressions.

Avoid do/while loops

do {
 continue;
} while (false);

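How many times does this loop run? You have to stop and think. (Once: continue jumps to the condition check, which is false.) A do/while can always be written another way, so it is best avoided.

Return early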

public boolean Contains(String str, String substr) {
 if (str == null || substr == null) return false;
 if (substr.equals("")) return true;
 ...
}

Returning from a function as early as possible makes the logic clearer.

Minimize nesting

if (user_result == SUCCESS) {
 if (permission_result != SUCCESS) {
 reply.WriteErrors("error reading permissions");
 reply.Done();
 return;
 }
 reply.WriteErrors("");
} else {
 reply.WriteErrors(user_result);
}
reply.Done();

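This code has only one level of nesting, yet it is still slightly confusing. Does any of your own code look like this? Approaching it from a different angle and applying the return-early principle makes it much easier to read: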

if (user_result != SUCCESS) {
 reply.WriteErrors(user_result);
 reply.Done();
 return;
}
if (permission_result != SUCCESS) {
 reply.WriteErrors("error reading permissions");
 reply.Done();
 return;
}
reply.WriteErrors("");
reply.Done();

The same technique works for nesting inside loops:

for (int i = 0; i < results.size(); i++) {
 if (results[i] != NULL) {
 non_null_count++;
 if (results[i]->name != "") {
 cout << "Considering candidate..." << endl;
 ...
 }
 }
}

Rewrite it with continue, the loop's counterpart of an early return:

for (int i = 0; i < results.size(); i++) {
 if (results[i] == NULL) continue;
 non_null_count++;
 
 if (results[i]->name == "") continue;
 cout << "Considering candidate..." << endl;
 ... 
}

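Breaking down giant expressions

The more complex an expression is, the harder it is to read, so large, complicated expressions should be split into small pieces that are easy to digest.

Explaining variables

The simplest way to break down a complex expression is to introduce an extra variable:

if line.split(':')[0].strip() == "root":
    ...

# With an explaining variable
username = line.split(':')[0].strip()
if username == "root":
    ...

Or consider this example: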

if (request.user.id == document.owner_id) {
 // user can edit this document...
}
...
if (request.user.id != document.owner_id) {
// document is read-only...
}
 
// With a summary variable
final boolean user_owns_document = (request.user.id == document.owner_id);
if (user_owns_document) {
 // user can edit this document...
}
...
if (!user_owns_document) {
 // document is read-only...
}

Using De Morgan's laws

1) not (a or b or c) <=> (not a) and (not b) and (not c)
2) not (a and b and c) <=> (not a) or (not b) or (not c)

So you can write:

if (!(file_exists && !is_protected)) Error("Sorry, could not read file.");
 
// Rewritten
if (!file_exists || is_protected) Error("Sorry, could not read file.");
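Both laws, and the file example, can be verified by brute force over every combination of truth values. This is a small sketch in Python; the function names are made up for the example.

```python
from itertools import product


def de_morgan_holds():
    """Check both De Morgan equivalences for all 8 combinations of a, b, c."""
    for a, b, c in product([False, True], repeat=3):
        if (not (a or b or c)) != ((not a) and (not b) and (not c)):
            return False
        if (not (a and b and c)) != ((not a) or (not b) or (not c)):
            return False
    return True


def file_check_is_equivalent():
    """Check that the rewritten file condition matches the original."""
    for file_exists, is_protected in product([False, True], repeat=2):
        original = not (file_exists and not is_protected)
        rewritten = (not file_exists) or is_protected
        if original != rewritten:
            return False
    return True
```

An exhaustive check like this is cheap for a handful of Boolean inputs and is a handy safety net when rewriting a condition by hand.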

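Do not abuse short-circuit logic

assert((!(bucket = FindBucket(key))) || !bucket->IsOccupied());

This can be replaced by the following, which takes two lines but is much easier to understand:

bucket = FindBucket(key);
if (bucket != NULL) assert(!bucket->IsOccupied());

Be careful with expressions like the one below as well: in some languages (Python, JavaScript, and Ruby, for example) x does not become a Boolean but takes the value of the first operand that evaluates as true:

x = a || b || c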
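In Python the same behaviour comes from the `or` operator, which returns one of its operands rather than True or False. A minimal sketch, with a hypothetical helper name:

```python
def first_truthy(a, b, c):
    """Return the first truthy operand, or the last operand if none is truthy."""
    return a or b or c


picked = first_truthy("", 0, "fallback")  # "" and 0 are falsy, so "fallback"
none_truthy = first_truthy(None, [], 0)   # all falsy: the last operand, 0
```

This is convenient for picking defaults, but note that legitimate falsy values such as 0 or an empty string are skipped too, which is the kind of surprise the warning above is about.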

Breaking down giant statements

var update_highlight = function (message_num) {
 if ($("#vote_value" + message_num).html() === "Up") {
 $("#thumbs_up" + message_num).addClass("highlighted");
 $("#thumbs_down" + message_num).removeClass("highlighted");
 } else if ($("#vote_value" + message_num).html() === "Down") {
 $("#thumbs_up" + message_num).removeClass("highlighted");
 $("#thumbs_down" + message_num).addClass("highlighted");
 } else {
 $("#thumbs_up" + message_num).removeClass("highlighted");
 $("#thumbs_down" + message_num).removeClass("highlighted");
 }
};

There is a lot of repetition here. Extracting the repeated expressions into variables simplifies it:

var update_highlight = function (message_num) {
 var thumbs_up = $("#thumbs_up" + message_num);
 var thumbs_down = $("#thumbs_down" + message_num);
 var vote_value = $("#vote_value" + message_num).html();
 var hi = "highlighted";
 
 if (vote_value === "Up") {
 thumbs_up.addClass(hi);
 thumbs_down.removeClass(hi);
 } else if (vote_value === "Down") {
 thumbs_up.removeClass(hi);
 thumbs_down.addClass(hi);
 } else {
 thumbs_up.removeClass(hi);
 thumbs_down.removeClass(hi);
 }
}

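Variables and readability

Eliminating variables

The previous section used extra variables to break down large expressions. This one looks at how to remove variables that are not needed.

Useless temporary variables

now = datetime.datetime.now()
root_message.last_view_time = now

now can be removed here because:

It does not break down a complex expression.
It adds no clarity, since `datetime.datetime.now()` is already clear.
It is used only once.

So the code can simply be written as:

root_message.last_view_time = datetime.datetime.now()

Eliminating control-flow variables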

boolean done = false;
while (/* condition */ && !done) {
 ...
 if (...) {
 done = true;
 continue; 
 }
}

The job of done can be handled better in another way:

while (/* condition */) {
 ...
 if (...) {
 break;
 } 
}

This example is easy to fix. With more deeply nested loops a break may not be enough; in that case, move the code into a function of its own.
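A sketch of that last suggestion, in Python: once the nested loops live in their own function, a return leaves all of them at once and no flag is needed. The function and its grid argument are invented for illustration.

```python
def find_first_negative(grid):
    """Return (row, col) of the first negative cell in grid, or None."""
    for row_index, row in enumerate(grid):
        for col_index, cell in enumerate(row):
            if cell < 0:
                return (row_index, col_index)  # leaves both loops at once
    return None


position = find_first_negative([[1, 2], [3, -4]])  # (1, 1)
```

Without the function, the inner loop would need a `done`-style variable that the outer loop also has to test.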

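Shrink the scope of your variables

We have all heard the advice to avoid global variables. It is sound: the larger a variable's scope, the harder the variable is to track, so keep scopes small.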

class LargeClass {
 string str_;
 void Method1() {
 str_ = ...;
 Method2();
 }
 void Method2() {
 // Uses str_
 }
 // Lots of other methods that don't use str_ 
 ... ;
}

The scope of str_ is larger than it needs to be. It can be written differently:

class LargeClass {
 void Method1() {
 string str = ...;
 Method2(str); 
 }
 void Method2(string str) {
 // Uses str
 }
 // Now other methods can't see str.
};

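Passing str as a function argument shrinks its scope and makes the code easier to read. The same idea applies to class design: split a large class into several smaller ones.

Do not rely on nested scope

# No use of example_value up to this point.
if request:
    for value in request.values:
        if value > 0:
            example_value = value
            break

for logger in debug.loggers:
    logger.log("Example:", example_value)

In Python a variable defined inside a block leaks into the enclosing function, so this code looks fine but fails at run time with "example_value is not defined" whenever no positive value is found. The fix is not hard:

example_value = None
if request:
    for value in request.values:
        if value > 0:
            example_value = value
            break

if example_value:
    for logger in debug.loggers:
        logger.log("Example:", example_value)

But following the earlier advice about eliminating intermediate variables, there is an even better way:

def LogExample(value):
    for logger in debug.loggers:
        logger.log("Example:", value)

if request:
    for value in request.values:
        if value > 0:
            LogExample(value)  # deal with 'value' immediately
            break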

Declare variables where they are first used

Older versions of C required all variables to be declared at the top of a function or block. When there are many variables, that forces the reader to take them all in at once, so hold off on defining a variable until it is needed:

def ViewFilteredReplies(original_id):
    filtered_replies = []
    root_message = Messages.objects.get(original_id)
    all_replies = Messages.objects.select(root_id=original_id)
    root_message.view_count += 1
    root_message.last_view_time = datetime.datetime.now()
    root_message.save()

    for reply in all_replies:
        if reply.spam_votes <= MAX_SPAM_VOTES:
            filtered_replies.append(reply)

    return filtered_replies

The reader has to keep track of three variables from the first lines on. Move each definition to just before its first use:

def ViewFilteredReplies(original_id):
    root_message = Messages.objects.get(original_id)
    root_message.view_count += 1
    root_message.last_view_time = datetime.datetime.now()
    root_message.save()

    all_replies = Messages.objects.select(root_id=original_id)
    filtered_replies = []
    for reply in all_replies:
        if reply.spam_votes <= MAX_SPAM_VOTES:
            filtered_replies.append(reply)

    return filtered_replies

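Prefer write-once variables

Too many variables confuse the reader, and so does a single variable that keeps being reassigned. The fewer times a variable changes, the more readable the code.

An example

Suppose a page contains the following inputs, and the first empty one needs to be given a value: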

<input type="text" id="input1" value="Dustin">
<input type="text" id="input2" value="Trevor">
<input type="text" id="input3" value="">
<input type="text" id="input4" value="Melissa">
...
var setFirstEmptyInput = function (new_value) {
 var found = false;
 var i = 1;
 var elem = document.getElementById('input' + i);
 while (elem !== null) {
 if (elem.value === '') {
 found = true;
 break; 
 }
 i++;
 elem = document.getElementById('input' + i);
 }
 if (found) elem.value = new_value;
 return elem;
};

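This code works. It uses three variables, so let us improve them one at a time. found is an intermediate variable and can be eliminated entirely: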

var setFirstEmptyInput = function (new_value) {
 var i = 1;
 var elem = document.getElementById('input' + i);
 while (elem !== null) {
 if (elem.value === '') {
 elem.value = new_value;
 return elem;
 }
 i++;
 elem = document.getElementById('input' + i);
 }
 return null;
};

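Next, elem: it drives the loop and is assigned in several places, so its value is hard to follow. Rewriting the loop as a for loop over i lets elem be assigned only once per iteration: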

var setFirstEmptyInput = function (new_value) {
 for (var i = 1; true; i++) {
 var elem = document.getElementById('input' + i);
 if (elem === null)
 return null; // Search Failed. No empty input found.
 if (elem.value === '') {
 elem.value = new_value;
 return elem;
 }
 }
};

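Reorganizing your code

Extracting unrelated subproblems

Engineering is about breaking a big problem into small ones and solving them one by one, which also helps keep programs robust and readable. Some guidelines for identifying subproblems:

Look at a function or block of code and ask yourself, "What is the high-level goal of this code?"
For each line, ask, "Is it working directly toward that goal, or is it solving an unrelated subproblem?"
If enough lines are solving an unrelated subproblem, extract them into a separate function.

An example: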

ajax_post({
 url: 'http://example.com/submit',
 data: data,
 on_success: function (response_data) {
 var str = "{\n";
 for (var key in response_data) {
 str += " " + key + " = " + response_data[key] + "\n";
 }
 alert(str + "}");
 // Continue handling 'response_data' ...
 }
});

The goal of this code is to send an ajax request, so the string formatting can be extracted:

var format_pretty = function (obj) {
    var str = "{\n";
    for (var key in obj) {
        str += "  " + key + " = " + obj[key] + "\n";
    }
    return str + "}";
};

Unexpected benefits

There are many good reasons to extract format_pretty. A standalone function is easy to extend with features, to harden, and to teach about edge cases. So format_pretty can be improved into a more capable function:

var format_pretty = function (obj, indent) {
    // Handle null, undefined, strings, and non-objects.
    if (obj === null) return "null";
    if (obj === undefined) return "undefined";
    if (typeof obj === "string") return '"' + obj + '"';
    if (typeof obj !== "object") return String(obj);
    if (indent === undefined) indent = "";

    // Handle (non-null) objects.
    var str = "{\n";
    for (var key in obj) {
        str += indent + "  " + key + " = ";
        str += format_pretty(obj[key], indent + "  ") + "\n";
    }
    return str + indent + "}";
};

This function produces output like:

{
  key1 = 1
  key2 = true
  key3 = undefined
  key4 = null
  key5 = {
    key5a = {
      key5a1 = "hello world"
    }
  }
}

Doing this regularly builds up a body of reusable code, which can grow into your own library or be shared with others.

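Project-specific functions

Code unrelated to the goal can be extracted for reuse, and project-specific code can be extracted too, simply to keep the main code readable. For example: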

business = Business()
business.name = request.POST["name"]
 
url_path_name = business.name.lower()
url_path_name = re.sub(r"['.]", "", url_path_name) 
url_path_name = re.sub(r"[^a-z0-9]+", "-", url_path_name) 
url_path_name = url_path_name.strip("-")
business.url = "/biz/" + url_path_name
 
business.date_created = datetime.datetime.utcnow() 
business.save_to_database()

Extracted into a function, it reads much better:

CHARS_TO_REMOVE = re.compile(r"['.]+")
CHARS_TO_DASH = re.compile(r"[^a-z0-9]+")
 
def make_url_friendly(text):
    text = text.lower()
    text = CHARS_TO_REMOVE.sub('', text)
    text = CHARS_TO_DASH.sub('-', text)
    return text.strip("-")
 
business = Business()
business.name = request.POST["name"]
business.url = "/biz/" + make_url_friendly(business.name) 
business.date_created = datetime.datetime.utcnow() 
business.save_to_database()
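Another benefit of the extraction is that the helper can be run and checked on its own, without a Business object or a database. The snippet below repeats the helper so that it is self-contained; the sample business name is only an illustration.

```python
import re

CHARS_TO_REMOVE = re.compile(r"['.]+")      # apostrophes and periods are dropped
CHARS_TO_DASH = re.compile(r"[^a-z0-9]+")   # any other run of characters becomes "-"


def make_url_friendly(text):
    """Turn arbitrary text into a lowercase, dash-separated URL fragment."""
    text = text.lower()
    text = CHARS_TO_REMOVE.sub("", text)
    text = CHARS_TO_DASH.sub("-", text)
    return text.strip("-")


url = "/biz/" + make_url_friendly("A.C. Joe's Tire & Smog, Inc.")
# url == "/biz/ac-joes-tire-smog-inc"
```

Because the function takes and returns plain strings, edge cases such as leading punctuation can be tested in isolation.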

Simplifying an existing interface

Here is some code that reads a value from a cookie:

var max_results;
var cookies = document.cookie.split(';');
for (var i = 0; i < cookies.length; i++) {
 var c = cookies[i];
 c = c.replace(/^[ ]+/, ''); // remove leading spaces
 if (c.indexOf("max_results=") === 0)
 max_results = Number(c.substring(12, c.length));
}

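This code is ugly. An ideal interface would look more like this:

set_cookie(name, value, days_to_expire);
delete_cookie(name);

When an interface is less than ideal, you can always wrap it in functions of your own to make it easier to use.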

Reshape an interface to your needs

user_info = { "username": "...", "password": "..." }
user_str = json.dumps(user_info)
cipher = Cipher("aes_128_cbc", key=PRIVATE_KEY, init_vector=INIT_VECTOR, op=ENCODE)
encrypted_bytes = cipher.update(user_str)
encrypted_bytes += cipher.final() # flush out the current 128 bit block
url = "http://example.com/?user_info=" + base64.urlsafe_b64encode(encrypted_bytes)
...

The ultimate goal is to build a URL that carries the user info, yet most of the code deals with serializing and encrypting a Python object. That part can be extracted:

def url_safe_encrypt(obj):
    obj_str = json.dumps(obj)
    cipher = Cipher("aes_128_cbc", key=PRIVATE_KEY, init_vector=INIT_VECTOR, op=ENCODE)
    encrypted_bytes = cipher.update(obj_str)
    encrypted_bytes += cipher.final() # flush out the current 128 bit block
    return base64.urlsafe_b64encode(encrypted_bytes)

It can now be called from anywhere:

user_info = { "username": "...", "password": "..." }
url = "http://example.com/?user_info=" + url_safe_encrypt(user_info)

Extracting helper functions is a good habit, but do it in moderation: splitting code into too many tiny functions makes things hard to find.

One task at a time

Code should do only one task at a time

var place = location_info["LocalityName"]; // e.g. "Santa Monica"
if (!place) {
 place = location_info["SubAdministrativeAreaName"]; // e.g. "Los Angeles"
}
if (!place) {
 place = location_info["AdministrativeAreaName"]; // e.g. "California"
}
if (!place) {
 place = "Middle-of-Nowhere";
}
if (location_info["CountryName"]) {
 place += ", " + location_info["CountryName"]; // e.g. "USA"
} else {
 place += ", Planet Earth";
}
 
return place;

This code assembles a place name. It is full of conditionals and hard to read. Can the work be split into separate tasks?

var town = location_info["LocalityName"]; // e.g. "Santa Monica"
var city = location_info["SubAdministrativeAreaName"]; // e.g. "Los Angeles"
var state = location_info["AdministrativeAreaName"]; // e.g. "CA"
var country = location_info["CountryName"]; // e.g. "USA"

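The first task is to pull each value into its own variable, so that the long keys no longer have to be remembered later. The second task is to work out the second half of the place name:

// Start with the default, and keep overwriting with the most specific value.
var second_half = "Planet Earth";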
if (country) {
 second_half = country; 
}
if (state && country === "USA") {
 second_half = state; 
}

Then work out the first half:

var first_half = "Middle-of-Nowhere";
if (state && country !== "USA") {
 first_half = state; 
}
if (city) {
 first_half = city;
}
if (town) {
 first_half = town; 
}

And the job is done:

return first_half + ", " + second_half;

Noticing that the logic keeps branching on whether the country is "USA", you could also write:

var first_half, second_half;
if (country === "USA") {
 first_half = town || city || "Middle-of-Nowhere";
 second_half = state || "USA";
} else {
 first_half = town || city || state || "Middle-of-Nowhere";
 second_half = country || "Planet Earth";
}
return first_half + ", " + second_half;
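The final version translates almost line for line into Python, where `or` plays the role of `||`. This is a sketch for illustration only; `format_place` is a hypothetical name and the dictionary keys are taken from the example above.

```python
def format_place(location_info):
    """Build a "first half, second half" place name from a location dict."""
    # Task 1: extract the values.
    town = location_info.get("LocalityName")
    city = location_info.get("SubAdministrativeAreaName")
    state = location_info.get("AdministrativeAreaName")
    country = location_info.get("CountryName")

    # Task 2: pick the most specific value available for each half.
    if country == "USA":
        first_half = town or city or "Middle-of-Nowhere"
        second_half = state or "USA"
    else:
        first_half = town or city or state or "Middle-of-Nowhere"
        second_half = country or "Planet Earth"
    return first_half + ", " + second_half


place = format_place({
    "LocalityName": "Santa Monica",
    "AdministrativeAreaName": "CA",
    "CountryName": "USA",
})
# place == "Santa Monica, CA"
```

Keeping extraction and selection as two separate steps means each can be read, and changed, without thinking about the other.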

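Turning thoughts into code

When you explain something complicated to another person, the details can easily confuse them. So imagine explaining your code to someone in plain language: would they understand? A few steps help make code clearer:

Describe what the code needs to do in plain language, as you would to a colleague.
Pay attention to the key words and phrases in that description.
Write the code to match the description.

The following code checks whether the user is authorized: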

$is_admin = is_admin_request();
if ($document) {
 if (!$is_admin && ($document['username'] != $_SESSION['username'])) {
 return not_authorized();
 }
} else {
 if (!$is_admin) {
 return not_authorized();
 } 
}
// continue rendering the page ...

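The code is short, but the nested logic is complicated, and as earlier sections noted, heavy nesting hurts comprehension. Described in plain language, the logic is:

There are two ways to be authorized:
1. You are an admin.
2. You own the current document (if there is one).
Otherwise, you are not authorized.

Now write the code to match the description: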

if (is_admin_request()) {
 // authorized
} elseif ($document && ($document['username'] == $_SESSION['username'])) {
 // authorized
} else {
 return not_authorized();
}
// continue rendering the page ...
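The plain-language description maps onto code so directly that it can be written as a small predicate. Below is a Python sketch; `is_authorized` and its parameters are hypothetical, and the document is assumed to be a dict with a "username" key, or None when there is no document.

```python
def is_authorized(is_admin, document, session_username):
    """Return True if the request may proceed, following the two rules above."""
    # 1. You are an admin.
    if is_admin:
        return True
    # 2. You own the current document (if there is one).
    if document is not None and document["username"] == session_username:
        return True
    # Otherwise, you are not authorized.
    return False
```

Each branch corresponds to one sentence of the description, so a reviewer can check the code against the wording line by line.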

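Writing less code

The most readable code is no code at all!

Drop features that are not needed, and do not over-engineer.
Rethink the requirements: solve the simplest version of the problem that still achieves the overall goal.
Know the libraries you use, and reread their APIs from time to time.

Finally

The book also has chapters on testing, which are left for you to read yourself. Once again, the book is highly recommended:

English edition: The Art of Readable Code
Chinese edition: 编写可读代码的艺术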


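27 January 2010

The Art of Email

Posted in 邵珠庆

A product manager has no formal authority and no reporting line over the team. Moving a product forward depends largely on coordination skills and emotional intelligence: rallying colleagues around a shared vision so that everyone pulls together for the product's success. Email is the most common communication tool in modern companies, especially internet companies and foreign-owned firms. In a foreign-owned workplace, using email well is truly an art that lies in the mind: tone, wording, who goes in To and who in Cc, and countless other subtleties all test whether a product manager is up to the job. A competent product manager should know how to use this most basic communication tool skillfully and to its full potential.

People communicate face to face, by phone, by email and instant message, by notes and letters, and through formal reports and announcements. How well each channel works depends on three characteristics: first, whether both sides exchange information at the same time; second, whether feedback is fast and two-way; third, whether the message can be adapted to the other person's actual situation as the exchange goes on.

More and more people, at work and in private life, have replaced letters with email. But the speed and convenience inherent in email also make certain mistakes easy. Measured against the three characteristics above, the weaknesses of email become apparent. The mistakes they lead to may be unintentional, yet the damage can be enormous: a careless click on Send can ruin your career. Here is what to do, and what not to do, when writing email.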

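Do:

1. Make the subject line a summary of the whole message. Too many people leave it blank, which also makes the message hard to find later.

2. Keep the message short and to the point. Bullet points are recommended; avoid long essays.

3. Open with a salutation and close with an appropriate courtesy. I often read messages forwarded from other departments. Reading from the start of the thread, I find that managers frequently omit the salutation and closing when writing to subordinates but never when writing to their own superiors, which is contemptible. Sometimes I also receive requests for cooperation from staff in other departments that consist of a few sentences of content with no salutation and no thank-you. Such messages are extremely impolite; I suspect that an employee like this, once promoted to manager, will respect only superiors and be arrogant toward subordinates. Email is a replacement for the letter; however its function and format change, respect between people must not. Remember to be polite and to respect the other person.

4. If the matter is urgent, state a deadline for the reply, but watch your tone. Nobody likes being ordered around, and everyone has self-respect, especially colleagues at your own level.

5. Pay attention to your mood when you write. Stay calm, especially in work email, and above all in email between departments.

6. When you receive a message that attacks you, the best response is to pick up the phone or talk face to face. That shows tolerance and generosity.

7. Use Cc sensibly, and Bcc even more so. You need to solve a problem, but do you really have to copy the other person's superior or your own? Does it carry a hint of threat?

8. For Chinese employees writing in English: avoid obscure words. Some words have a dictionary gloss identical to a Chinese word yet mean something quite different in actual use. The two languages are built differently: an unfamiliar Chinese word can often be guessed, whereas an English word you have never seen simply means nothing to you. This applies to native speakers too, whose vocabulary depends largely on their education.

9. New employees, especially in international companies, may have to communicate with colleagues abroad. Ask your manager to send an introductory message to the overseas contact before you write, or have your manager look over your message before you send it.

10. Turn on grammar and spell checking. Grammar mistakes can cause misunderstanding, and typos show how seriously you take the message.

11. Use email to organize meetings, to summarize what was discussed and concluded, and to send that summary to all participants.

12. Use the out-of-office feature to let others know your status, for example when you are traveling, so that they can reach you another way and misunderstandings are avoided.

13. Try to reply on the same day (do not leave it overnight), so that the sender is not left waiting or guessing.

14. Before you click Send, read your message carefully, more than once.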

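Don't:

1. Use email to mock your boss or call your colleagues to account.

2. Be careless about the content, unless you want your message to appear in the newspapers or on television and radio the same day.

3. Use email to start a war or keep one going. Avoid conveying personal anger and emotion in email, and do not set out to destroy the other person with a message. Important, complicated, hard-to-communicate, or personal matters are best handled face to face or by phone. Email wars are commonplace in companies; spending a whole day, full of anger and racking your brains, on a single message is foolish. Remember that an email must never read like a shouting match. Even if you believe you are being objective, your mood will show between the lines. Your boss has been through all this before and sees through your message as easily as a teacher spots a pupil cheating. The usual outcome is that both sides get blamed equally.

4. Send the message to the whole world. That escalates the war and makes the conflict public, which is very bad for your career prospects; no manager wants to deal with a subordinate like that.

5. Use disrespectful language. It embarrasses both of you and even the people around you, and it causes your point to be badly misunderstood.

Of course, two other situations also come up often in daily work:

First: your manager never communicates decisions by email, only verbally. There are usually two reasons: either he is not confident in the decision, or what he is asking you to do violates company rules, ethics, or the law, and he wants to leave no written evidence.

Second: foreign colleagues generally expect you to answer their email the same night. If you do not, follow-ups arrive one after another, sometimes copied to your boss and theirs. But when you urgently need them, you will usually wait until the flowers have wilted. There are usually two reasons: either they are on vacation, Europeans especially, or every foreign colleague considers himself your boss. These are only my own limited observations; forgive me if they seem naive.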
