CLARITY AND ELEGANCE IN PROGRAMS by Gary T. Leavens Department of Computer Science, Iowa State University $Date: 1993/10/02 22:47:16 $ This note expands on the remarks about the clarity and elegance of programs in the course specification (/home/cs227/doc/course-spec.txt). That document says "You are encouraged to use helping procedures. "For programming problems, there is usually more than one right answer, hence it is important to write a clear solution. Answers to programming questions will be scored with the following breakdown of points: * "60% for correctness (whether your program solves the given problem correctly in the given subset of the language) * "40% for elegance and clarity (using relevant techniques to modularize the solution, making good use of language features, the creativity of your solution) "You should try to eliminate all syntax errors from your programs, as they affect both the correctness and clarity of your solution; you may receive no points for programs with major syntax errors. Checking of external inputs, and other conventions for dealing with humans are not important (for this class), unless the problem says so." The only explanation it gives of elegance and clarity is: "(using relevant techniques to modularize the solution, making good use of language features, the creativity of your solution)" So let me try to amplify these comments. 1. USING RELEVANT TECHNIQUES TO MODULARIZE THE SOLUTION. The modularization (or the modularity) of a solution means how the program is broken up into pieces. For the moment, this means using appropriate helping procedures. When we get to chapter 5, it means also using let and letrec where they are appropriate (as these help make definitions local, and thus increase the modularity by making it easier to separate parts of a problem). When we get to chapter 7, it will mean making use of higher-order patterns (built using lambda) to pull apart pieces of algorithms. In chapters 9 and 11 you will see how to use imperative features, but these are easily abused to make code confusing. We'll try to show some ways that they help to clarify that style of code (so called structured programming techniques). Helping procedures can be confusing, especially if you don't include comments stating what they are doing and what they do is not obvious. But at this stage of the game in Scheme it's hard to do that. There is also a notion of coupling of procedures that is relevant to this. If a procedure foo passes many arguments to a procedure bar, and if it expects to get a large, complex result back then it may be that foo and bar are too closely coupled. The solution to this is to have each procedure do a simple job, something that you can state in a single sentence. (Many also find procedures that do two or more jobs, typically as indicated by "flag" or boolean variables objectionable for the same reason.) 2. MAKING GOOD USE OF LANGUAGE FEATURES. There is a completely understandable tendency to use only a small subset of whatever language you are learning to express everything. However, good use of all the language features we learn should help make your programs clearer (at least to the TAs). I break this up into several aspects. 2.1. Eliminating redundant code Two pieces of code that are the same, and that serve the same purpose, make the program longer (probably harder to write and read), and certainly have a detrimental effect on the ease of changing the code. So I usually try to eliminate such codes. For example, the following has redundant code: (define insert-long-adjective (lambda (adj word sent) (if (null? sent) '() (if (equal? (car sent) word) (cons (car sent) (cons adj (cdr sent))) (cons (car sent) (insert-long-adjective adj word (cdr sent))))))) This can be fixed by using if as an expression: (define insert-long-adjective (lambda (adj word sent) (if (null? sent) '() (cons (car sent) (if (equal? (car sent) word) (cons adj (cdr sent)) (insert-long-adjective adj word (cdr sent))))))) Other examples would be using "and" or "or" to combine cases, using helping procedures or let/letrec (in chapter 5) to avoid writing down the same expression twice, and using lambda to capture common patterns (in chapter 7). 2.2. Brevity This is a point of some controversy, but often a shorter program is more desirable than a longer one. The controversy is that a shorter program uses more powerful constructs, and thus demands that the reader know more about the language. However, I think that in conjunction with learning the language it is almost always desirable to find ways to say things in a shorter manner. (The reason is that you should learn the language.) One example is the above insert-long-adjective. Others are elimination of #t and #f by using "and" and "or", using cond instead of deeply nested if's. One exception to brevity is in the names of procedures that are defined at the top-level. Here the name needs to convey something of what it is used for, and hence the longish names for procedures in our homework. The same applies to the names of global variables. I do prefer fairly short names for formal parameters, although they should say something; I don't usually use the variable name "x" unless the variable can be anything at all. 3. THE CREATIVITY OF YOUR SOLUTION The other way we use the word elegance in everyday speak is to describe something that is very clever. Usually this is a novel combination of ideas, and often (very) brief. However, the brevity may come into play only in the main procedure, and it may take some helping procedures to flesh out the main idea. Some examples: The following solution to last-adj-out (HW2, problem 17), by Chris Sauer, is (I think) creative: ; with the helping procedures rac, snoc, and rdc ; last-adj-out can be programmed like remove-1st (define last-adj-out (lambda (adj sent) (cond ((null? sent) '()) ((eq? (rac sent) adj) (rdc sent)) (else (snoc (last-adj-out adj (rdc sent)) (rac sent)))))) (define rac ;;; CAR backwards (lambda (ls) (if (null? (cdr ls)) (car ls) (rac (cdr ls))))) (define snoc ;;; CONS backwards (lambda (ls item) (if (null? ls) (cons item ls) (cons (car ls) (snoc (cdr ls) item))))) (define rdc ;;; CDR backwards (lambda (ls) (if (null? (cdr ls)) '() (cons (car ls) (rdc (cdr ls)))))) 4. CLARITY The basic guidelines of clarity are to * indent your code consistently and logically (e.g., use emacs to do it) * write your solutions in a straight-forward manner (like the examples we present in class), * to comment on any tricks or creative steps you make (i.e., those that are not like the ones used in class), * and to comment all helping procedures that are the least bit complex. For example, the above coding of last-adj-out had a comment describing the "tricky" idea of thinking of last-adj-out as remove-1st backwards. I like to use the following kinds of comments on (helping) procedures: TYPE comments (telling the type of the arguments and results and either ASSUMES/EFFECT or REQUIRES/ENSURES comments. Here is an example using TYPE, ASSUMES, and EFFECT comments (define acronym ; TYPE: (-> ((list (list symbol))) (list symbol)) (lambda (ls) ; ASSUME: no sublist of ls is empty, this allows the caar below ; EFFECT: return a list of the first items in each sublist of ls (if (null? ls) '() (cons (caar ls) (acronym (cdr ls)))))) The TYPE notation is explained in the "Make Your own Scheme Reference Card" handout. Note that the TYPE comment is attached to the name, but the ASSUME and EFFECT (or REQUIRES/ENSURES) comments come after the lambda. This allows the descriptions in the ASSUME/EFFECT to refer to the formal parameters by name. I like to give a simple imperative sentence for the EFFECT clause. It should say what the procedure does; think of writing down a command (do this, return that, etc.). If you need 2 sentences, or can't write one, think about redefining the procedure. Most EFFECT comments (until chapters 9 and 11) start with "return ...". For procedures without side-effects I usually use REQUIRES and ENSURES instead. For example, (define acronym ; TYPE: (-> ((list (list symbol))) (list symbol)) (lambda (ls) ; REQUIRES: no sublist of ls is empty ; ENSURES: result is a list of the first items ; in each sublist of ls (if (null? ls) '() (cons (caar ls) (acronym (cdr ls)))))) Note that the ENSURES comment starts with "result is", as is common, and describes the result (often passively). Some students like the more active form of the EFFECT style comments, but in the formal specification languages that I am designing, we use REQUIRES and ENSURES for technical reasons. (These are called pre- and post-conditions in the jargon.) Another way to document helping procedures that helps fix in your mind what they are to do is to write examples for them, just as we do in homeworks. I think this is an excellent way to document helping procedures, because it forces you to think about the problem they solve, and then you can solve it. ;; Examples: ;; (acronym '()) ==> () ;; (acronym '((c o n t e n t s) (a d d r e s s) (r e g i s t e r))) ;; ==> (c a r) ;; (acronym '((c o n t e n t s) (d e c r e m e n t) (r e g i s t e r))) ;; ==> (c d r) ;; (acronym '((l o t s) (i r r i t a t i n g) ;; (s i g n i f i c a n t) (p a r e n t h e s e s))) ;; ==> (l i s p) (define acronym (lambda (ls) (if (null? ls) '() (cons (caar ls) (acronym (cdr ls)))))) A variation on this would be to use equations. For example, (acronym '()) = '(). But I think doing that is a bit harder. However, another way to use equations would be to write a recursive definition, like the following. (define acronym ; TYPE: (-> ((list (list symbol))) (list symbol)) (lambda (ls) ; EQUATIONS: (acronym '()) = '() ; (acronym (cons x l)) = (cons (car x) (acronym l)) (if (null? ls) '() (cons (caar ls) (acronym (cdr ls)))))) However, this is difficult, and it seems, fairly redundant with the program! But it is a very good way of documenting the higher-order functions in chapter 7. Of course you can combine these techniques. Try experimenting with them and see which one you like best. Or invent new ones. You might try experimenting with writing Scheme keywords in upper case. For example, (DEFINE acronym (LAMBDA (ls) (IF (null? ls) '() (cons (caar ls) (acronym (cdr ls)))))) You can also use square brackets ([ and ]) to good effect in a Scheme that lets you (such as Chez Scheme and PC Scheme). However, remember that these are not portable. 5. MISCELLANY Some random tips on clarity. None of these are important. All of these can be violated for a good reason. 5.1. In an IF (or COND) it's best to have the shortest expression first. The reason is that the longer one tends to hide the sorter one. For example, (if (null? ls) '() (cons (caar ls) (acronym (cdr ls)))) is clearer than (if (pair? ls) (cons (caar ls) (acronym (cdr ls))) '()) 5.2. If you can avoid negatives, do so. That is, try not to ask questions using "not" if possible. For example, (if (pair? ls) (cons (caar ls) (acronym (cdr ls))) '()) is clearer than (if (not (null? ls)) (cons (caar ls) (acronym (cdr ls))) '()) But note that these are not quite opposites, so you have to think about what is correct first, then worry about this. More often this comes up in defining a problem for a helping procedure. For example, instead of making a procedure called "not-pair-or-null?" I choose to call it "atomic-item?", which is a positive way of putting it. 6. CLOSING REMARKS None of the above are laws of programming. You may break them for good reasons. It's more important to develop a sense of your own for style and elegance. Try doing it wrong to see what elegance and clarity mean! Remember also that in 227 you are not allowed to use language features that we haven't learned yet. Let me close by reminding you that you should think first about getting the program to run, and only later about elegance. However, you may need to clarify it to get it to run, especially carefully defining the helping procedures.