The Art of Unix Programming
McIlroy
- Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new features. (Each program does one thing only, but does it well; prefer building a new program over adding features to an existing one.)
- Expect the output of every program to become the input to another, as yet unknown, program. Don't clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don't insist on interactive input. (Be liberal in what you accept and strict in what you emit, and avoid interactive behavior.)
- Design and build software, even operating systems, to be tried early, ideally within weeks. Don't hesitate to throw away the clumsy parts and rebuild them (well, I can't do this, at least right now). (Start designing and writing code as early as possible.)
- Use tools in preference to unskilled help to lighten a programming task, even if you have to detour to build the tools and expect to throw some of them out after you've finished using them. (Learn to use tools, even ones you will no longer need once the project is finished.)
- This is the Unix philosophy: write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface. (Each program does one thing well and handles only text, because plain text is the universal interface.)
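The points above about text streams and composable programs can be sketched in a few lines of Python. This is a minimal illustration, not anything from the book: the filter names `strip_comments` and `upcase` and the helper `pipeline` are made up for this note, and `pipeline` only imitates what a shell pipe does.

```python
import io
import sys


def strip_comments(in_stream, out_stream):
    """Copy lines from in_stream to out_stream, dropping blank lines and
    lines whose first non-space character is '#'."""
    for line in in_stream:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # emit nothing extraneous: no banners, no counts
        out_stream.write(stripped + "\n")


def upcase(in_stream, out_stream):
    """A second, independent filter: upper-case every line."""
    for line in in_stream:
        out_stream.write(line.upper())


def pipeline(text, *filters):
    """Chain filters the way a shell pipe would: the text output of one
    stage becomes the text input of the next."""
    for run_filter in filters:
        out = io.StringIO()
        run_filter(io.StringIO(text), out)
        text = out.getvalue()
    return text


def main():
    """Entry point for use as a real filter: stdin in, stdout out."""
    strip_comments(sys.stdin, sys.stdout)
```

Because each filter only sees a text stream, neither one needs to know about the other. Calling `main()` at the bottom of a script would let the shell do the chaining instead of `pipeline`.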
Rob Pike
- You can't tell where a program is going to spend its time. Bottlenecks occur in surprising places, so don't try to second guess and put in a speed hack until you've proven that's where the bottleneck is. (Measure the program to find the bottleneck first, then think about optimizing.)
- Measure. Don't tune for speed until you've measured, and even then don't unless one part of the code overwhelms the rest. (Measure before you optimize.)
- Fancy algorithms are slow when n is small, and n is usually small. Fancy algorithms have big constants. Until you know that n is frequently going to be big, don't get fancy. (Even if n does get big, use Rule 2 first.) (Time complexity is a good thing, but never forget the constant factors.)
- Fancy algorithms are buggier than simple ones, and they're much harder to implement. Use simple algorithms as well as simple data structures. (Complex algorithms are always more bug-prone, so keep both data structures and algorithms as simple as possible.)
- Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming. (Be data-driven rather than code-driven, so that the algorithms explain themselves.)
- There is no Rule 6. (The five rules above are all there is.)
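Rules 1 and 2 can be turned into a habit with very little code. The sketch below is one way to do it, assuming a program that splits into named stages: the helpers `timed` and `bottleneck` and the 50% threshold are inventions of this note, chosen to mirror "don't unless one part of the code overwhelms the rest".

```python
import time


def timed(func, *args, repeat=5):
    """Return the best wall-clock time in seconds over `repeat` runs."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def bottleneck(timings, threshold=0.5):
    """Given {stage_name: seconds}, return the stage that takes more than
    `threshold` of the total time, or None if no stage dominates."""
    total = sum(timings.values())
    if total <= 0:
        return None
    name, seconds = max(timings.items(), key=lambda item: item[1])
    return name if seconds / total > threshold else None
```

A typical use would be to build the dictionary with `timed` for each stage, and only reach for a faster algorithm when `bottleneck` names a stage.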
17 rules from Eric Raymond in The Art of Unix Programming
- Rule of Modularity: write simple parts connected by clean interfaces. (module interfaces)
- Rule of Clarity: clarity is better than cleverness. (clarity)
- Rule of Composition: design programs to be connected to other programs. (interfaces between programs)
- Rule of Separation: separate policy from mechanism; separate interfaces from engines. (separation)
- Rule of Simplicity: design for simplicity; add complexity only where you must. (simplicity)
- Rule of Parsimony: write a big program only when it is clear by demonstration that nothing else will do. (thrift)
- Rule of Transparency: design for visibility to make inspection and debugging easier. (transparency)
- Rule of Robustness: robustness is the child of transparency and simplicity. (robustness)
- Rule of Representation: fold knowledge into data so program logic can be stupid and robust. (data-driven)
- Rule of Least Surprise: in interface design, always do the least surprising thing. (don't surprise people)
- Rule of Silence: when a program has nothing surprising to say, it should say nothing. (stay silent when silence is called for)
- Rule of Repair: when you must fail, fail noisily and as soon as possible. (repair)
- Rule of Economy: programmer time is expensive; conserve it in preference to machine time. (human time is worth more than machine time)
- Rule of Generation: avoid hand-hacking; write programs to write programs when you can (well, I think that's right; Unix has many useful tools that generate programs for you, such as yacc (bison), lex (flex), twig, texinfo, etc.). (let the machine help you write programs)
- Rule of Optimization: prototype before polishing. Get it working before you optimize it. (the guideline for optimization)
- Rule of Diversity: distrust all claims for "one true way". (diversity)
- Rule of Extensibility: design for the future, because it will be here sooner than you think. (extensibility)
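The Rule of Representation from the list above is easy to show with a table-driven function. This is a small sketch with made-up names (`UNITS`, `to_bytes`): all the knowledge sits in the table, so the logic stays trivial, and an unknown unit fails noisily in the spirit of the Rule of Repair.

```python
# Knowledge lives in this table; supporting a new unit means adding a row,
# not adding a branch to the code.
UNITS = {
    "B": 1,
    "KiB": 1024,
    "MiB": 1024 ** 2,
    "GiB": 1024 ** 3,
}


def to_bytes(text):
    """Parse strings such as '4 KiB' using only the table above."""
    number, unit = text.split()
    if unit not in UNITS:
        # Fail noisily and immediately rather than guessing.
        raise ValueError(f"unknown unit: {unit!r}")
    return int(number) * UNITS[unit]
```

The alternative, a chain of `if unit == ...` branches, encodes the same facts in code, where they are harder to inspect and easier to get wrong.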
Unix programmers tend to be good at writing references, and most Unix documentation has the flavor of a reference or aide memoire for someone who thinks like the document-writer but is not yet an expert at his or her software. The results often look much more cryptic and sparse than they actually are. Read every word carefully, because whatever you want to know will probably be there, or deducible from what's there. Read every word carefully, because you will seldom be told anything twice.
Unix programmers are mostly the people who write these manuals, so if you are a beginner, or new to a piece of software, you need to read every sentence carefully: anything you skip will not be mentioned again later :-)
Best Practices for Writing Unix Documentation
- When you write documentation for people within the Unix culture, don't dumb it down. If you write as if for idiots, you will be written off as an idiot yourself. Dumbing documentation down is very different from making it accessible. The former is lazy and omits important things, whereas the latter requires careful thought and ruthless editing.
- Don't think for a moment that volume will be mistaken for quality. And especially, never ever omit functional details because you fear they might be confusing, nor warnings about problems because you don't want to look bad. It's unanticipated problems that will cost you credibility and users, not the problems you were honest about. My takeaways: first, never write documentation as if it were for idiots, which is very different from making it easy to understand. Second, describe every function fully, even when that doesn't look flattering; this is your honesty, and it lets users see clearly what the software can currently do.
- Good documentation is usually the most visible sign of what separates a solid contribution from a quick and dirty hack. If you take the time and care necessary to produce it, you'll find you're already 85% of the way to having your patch accepted by most developers. On attitude toward documentation: good documentation immediately sets a contribution apart from poor work, so once you have written it, 85% of the success is already in hand :-)
Release early, release often. A rapid release tempo means quick and effective feedback. When each incremental release is small, changing course in response to real-world feedback is easier.
Keep the time between releases as short as possible and get feedback as quickly as possible :-)
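Unix Interface Design Patterns (this material taught me a lot :-), highly recommended)
- The Filter Pattern: Text -> Filter -> Text. Formats should be as uniform as possible, and the program uses standard input and standard output.
- The Cantrip Pattern: no input and no output, just a specific action that gets carried out.
- The Source Pattern: takes no input and only produces output.
- The Sink Pattern: takes input but produces no output.
- The Compiler Pattern: configuration is given on the command line, input comes from files, and output goes to files. Being able to switch options on and off from the command line is essential here.
- The ed Pattern: an interactive mode of operation in which typing a command performs a specific action and returns specific status information.
- The Rogue Pattern: also interactive; a keystroke triggers an action, and the effect is visible on a character-cell display. Its advantage over a GUI is that transmitting only character-cell data takes far less bandwidth.
- The 'Separated Engine and Interface' Pattern: the engine and the interface are separate programs. This pattern has several finer-grained variants:
- Configurator/Actor Pair: one program writes a configuration and another acts on it; what the interface produces is written to a config file, which the actor then executes.
- Spooler/Daemon Pair: similar to the producer-consumer model; very useful for batch work.
- Driver/Engine Pair: drivers offering different kinds of UI operate the engine. This is a very attractive approach, and it is how GIMP is implemented.
- Client/Server Pair: the familiar client/server model, so nothing more to say here.
- CLI Server: I did not understand this one; the book says it applies to protocols such as POP and IMAP.
- Language-Based Interface Pattern: also very general and extremely powerful. It requires writing an interactive language, though it is better to pick Scheme; there are existing implementations such as GNU Guile, or you can use Lisp the way Emacs does.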
Polyvalent-Program Pattern: a single program offers several ways of being used. The ideal arrangement is:
- X users -> graphical user interface -> service library
- terminal users -> command-line interface -> service library
- scripts -> scripting interface -> service library
In every case the underlying service is ultimately provided by the service library.
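The layering above can be sketched as one service function with two thin front ends. Everything named here (`word_count`, `cli`, `script_api`, the `wc-sketch` program name, the request dictionary shape) is hypothetical and chosen only for illustration; a graphical front end is left out but would call `word_count` in the same way.

```python
import argparse


# --- service library: the only place that does real work ---
def word_count(text):
    """Return the number of whitespace-separated words in text."""
    return len(text.split())


# --- command-line interface: thin wrapper for terminal users ---
def cli(argv):
    """Parse command-line arguments and return the text to print."""
    parser = argparse.ArgumentParser(prog="wc-sketch")
    parser.add_argument("text", help="text whose words should be counted")
    args = parser.parse_args(argv)
    return str(word_count(args.text))


# --- scripting interface: thin wrapper for other programs ---
def script_api(request):
    """Handle a request such as {"op": "count", "text": "a b"}."""
    if request.get("op") != "count":
        raise ValueError(f"unsupported operation: {request.get('op')!r}")
    return {"count": word_count(request["text"])}
```

Neither front end contains any counting logic, so a fix in `word_count` reaches every kind of user at once.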
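Spend your time on design quality, not the low-level details, and automate away everything you can, including the detail work of runtime debugging. This is my goal too: what I should pursue is design quality. Whether I am a software engineer or a senior engineer, an architect or a coder, what I need to care about is design, from a single module up to the whole system.
Reinventing the wheel is bad not only because it wastes time, but because reinvented wheels are often square. There is an almost irresistible temptation to economize on reinvention time by taking a shortcut to a crude and poorly-thought-out version, which in the long run often turns out to be false economy.
In this chapter Eric uses the example of a novice, J. Random Newbie, to show why programmers like to reinvent the wheel. The reasons programmers in corporate development end up building wheels are understandable, but that does not make wheel-building acceptable. Reuse does not mean that bad code may only be patched and never rewritten just because we need to reuse it. The key to reuse is transparency, and that is the whole point of open source: you can get the source code and modify it to meet your needs, and even if you have no such plans you can still get the source running and make the adjustments that suit you. That is the key to reuse.
Read before you write; develop the habit of reading code. There are seldom any completely new problems, so it's almost always possible to discover code that's close enough to what you need to be a good starting point. Even when your problem is genuinely novel, it's likely to be genetically related to a problem someone else has solved before, so the solution you need to develop is likely to be related to some existing one as well.
We rarely run into entirely new problems; most have already been solved by someone else, and the real question is how to combine those solutions. Reading source code is the best way to do that: even as a beginner you can see from the source how other people defined the problem. I now think that defining the problem clearly is extremely important, and reading other people's code shows you how they define problems and how they go about solving them.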
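Tradeoffs between interface and implementation complexity. The book describes two philosophies of complexity: the MIT philosophy and the New Jersey philosophy. The first stresses keeping the interface as simple as possible, while the second stresses keeping the internal implementation as simple as possible. The classic example is signal handling in System V versus BSD. System V follows the New Jersey style: a signal handler has to be reinstalled again and again, which makes the calling code look ugly but keeps the internal implementation somewhat simpler. BSD follows the MIT style: a handler is registered through a single call, which is pleasant to use. Still, Eric Raymond has it right: We can't offer a one-size-fits-all answer. The important thing is to develop the habit of thinking carefully about this issue on each and every one of your designs. Complexity is a cost you must budget very carefully.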
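The signal-handling contrast can be made concrete with a toy model. The sketch below does not use the real signal API; `ToySignal`, `count_handled`, and the two handlers are invented for this note, and the only assumption is the behavior described above: in the System V style the handler is reset to the default after each delivery, while in the BSD style a single registration persists.

```python
class ToySignal:
    """Toy model of one signal's disposition; not the real signal API."""

    def __init__(self, reset_on_delivery):
        self.reset_on_delivery = reset_on_delivery
        self.handler = None  # None stands for the default action

    def install(self, handler):
        self.handler = handler

    def deliver(self):
        handler = self.handler
        if handler is None:
            return "default"
        if self.reset_on_delivery:
            self.handler = None  # System V style: back to the default
        handler(self)
        return "handled"


def count_handled(sig, handler, deliveries):
    """Install handler once, deliver the signal repeatedly, and count how
    many deliveries actually reached the handler."""
    sig.install(handler)
    return sum(sig.deliver() == "handled" for _ in range(deliveries))


def plain_handler(sig):
    pass  # does its work and returns


def reinstalling_handler(sig):
    sig.install(reinstalling_handler)  # the caller pays for the simple kernel
```

With `reset_on_delivery=True`, `plain_handler` catches only the first delivery and every handler must reinstall itself. With `reset_on_delivery=False` the interface is simpler for the caller, at the price of more work inside the signal machinery.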
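The book distinguishes three kinds of complexity: essential complexity, accidental complexity, and optional complexity.
- Essential complexity cannot be avoided; it is the heart of the problem. As the book points out, a program that handles aircraft flight routes cannot be written in 10 lines. That complexity is unavoidable.
- Accidental complexity arises because no good design was found, as in the project I am working on now, where a better solution turned up only when the work was nearly finished. Accidental complexity happens because someone didn't find the simplest way to implement a specified set of features. Accidental complexity can be eliminated by good design, or good redesign.
- Optional complexity comes from wanting to add attractive features, and it can be dealt with by lowering the objectives. Optional complexity, on the other hand, is tied to some desirable feature. Optional complexity can be eliminated only by changing the project's objectives.
The closing summary is a classic: how you tell accidental complexity from optional complexity shapes the resulting design, and the choice of objectives determines how simple the program is and reflects on how smart the people in charge of the project are.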
All tend to evolve in accordance with the Law of Software Envelopment, aka Zawinski's Law: "Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can." To the extent Zawinski's Law is correct, it suggests that some things want to be small and some want to be large, but the middle ground is unstable. A piece of software ends up either very large or very small.
On frameworks. There is a hidden dual of the Unix gospel of small sharp tools, a background so implicit that many Unix practitioners don't notice it, any more than fish notice the water they swim in. This is the presence of FRAMEWORKS! Small sharp tools in the Unix style have trouble sharing data, unless they live inside a framework that makes communication among them easy. Emacs is such a framework, and unified management of shared context is what the optional complexity of Emacs is buying. In old-school Unix, the only framework was pipelines, redirection, and the shell; the integration was done with scripts, and the shared context was (essentially) the file system itself. But that was not the end of the evolution. Emacs unifies the file system with a world of text buffers and helper subprocesses, largely leaving the shell framework behind. Modern desktop environments provide a communication framework for GUIs, also leaving the shell framework behind. Each framework has strengths and weaknesses of its own. Frameworks become homes to ecologies of tools: the shell to shell scripts, Emacs to Lisp code, and desktop environments to flocks of GUIs.
The passage above explains why frameworks exist: a framework integrates tools and gives them a common platform on which they can do their work. Emacs is a framework whose tools are written in Lisp; the shell is a framework whose tools are shell scripts; a desktop environment is a framework whose tools are GUI programs. As I understand it, every framework provides a shared data context, that is, a platform on which data can be shared. This also brings to mind a point Eric makes later: never embed policy in a framework; provide only mechanism, so that each tool has room to do its job well. So frameworks turn out to be this necessary after all.
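There is a lesson here for ambitious system architects: the most dangerous enemy of a better solution is an existing codebase that's just good enough.
The author discusses Plan 9, a powerful operating system, and then analyzes why it did not succeed; some of the reasons are worth learning from. Some people might blame the lack of a proper marketing strategy and of detailed documentation, along with unclear fees and licensing. The author argues that since Plan 9 is a pure-bred descendant of Unix, none of that is the real problem: Unix also came out of AT&T's labs, and it had no more documentation or marketing strategy in its early days. In the author's view, Unix has its various discomforts but works well enough today, so Plan 9 arguably never had a chance (this is the "existing codebase" of the quotation).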
We can turn aside from this: we can remain a priesthood appealing to a select minority of the best and brightest, a geek meritocracy focused on our historical role as the keepers of the software infrastructure and the networks. But if we do this, we will very likely go into decline and eventually lose the dynamism that has sustained us through decades. Someone else will serve the people; someone else will put themselves where the power and the money are, and own the future of 92% of all software. The odds are, whether that someone else is M$ or not, that they will do it using practices and software we don't much like.
The author sees elitism as the problem in Unix culture, something he noticed at a Macintosh developer conference in 2000. Mac developers build everything around the user experience, while Unix developers think mostly about infrastructure, and each culture regards the other's work as badly designed. The author argues that to win over the 92% of users who are non-technical, Unix culture has to start paying attention to user experience and, going further, has to absorb and accept design ideas from other communities. That is the direction in which Unix culture should develop.
The IETF tradition reinforced this by teaching us to think of code as secondary to standards. Standards are what enable programmers to cooperate; they knit our technologies into wholes that are more than the sum of the parts.
In X, the specification has always ruled. Sometimes specs have bugs that need to be fixed too, but code is usually buggier than specs. Having a well-considered specification driving development allows for little argument about bug vs. feature; a system which incorrectly implements the specification is broken and should be fixed. I suspect this is so ingrained into most of us that we lose sight of its power.
The (re)invention of open source has had a significant impact on the standards process as well. Though it's not formally a requirement, the IETF has since around 1997 grown increasingly resistant to standards-track RFCs that don't have at least one open-source reference implementation. In the future, it seems likely that conformance to any given standard will increasingly be measured by conformance to (or outright use of) open-source implementations that have been blessed by the standard's authors. The flip side of this is that often the best way to make something a standard is to distribute a high-quality open-source implementation of it.
A specification is only the DNA: we are allowed to extend on top of the DNA, but the key parts must still adhere to the specification.